On the tensor spectral p-norm and its dual norm via partitions

This paper presents a generalization of the spectral norm and the nuclear norm of a tensor via arbitrary tensor partitions, a much richer concept than block tensors. We show that the spectral p-norm and the nuclear p-norm of a tensor can be lower and upper bounded by manipulating the spectral p-norms and the nuclear p-norms of subtensors in an arbitrary partition of the tensor for 1 ≤ p ≤ ∞. This generalizes and affirmatively answers the conjecture proposed by Li (SIAM J Matrix Anal Appl 37:1440–1452, 2016) for a tensor partition and p = 2. We study the relations among the norms of a tensor, the norms of matrix unfoldings of the tensor, and the bounds via the norms of matrix slices of the tensor. Various bounds of the tensor spectral and nuclear norms in the literature are implied by our results.


Introduction
The spectral p-norm of a tensor generalizes the spectral p-norm of a matrix. It can be defined by the L_p-sphere constrained multilinear form optimization problem

‖T‖_{pσ} := max { T(x_1, x_2, …, x_d) : ‖x_k‖_p = 1, k = 1, 2, …, d },   (1)

where ‖T‖_{pσ} denotes the spectral p-norm of a given tensor T = (t_{i_1 i_2 … i_d}) ∈ ℝ^{n_1 × n_2 × ⋯ × n_d},

T(x_1, x_2, …, x_d) := Σ_{i_1=1}^{n_1} Σ_{i_2=1}^{n_2} ⋯ Σ_{i_d=1}^{n_d} t_{i_1 i_2 … i_d} (x_1)_{i_1} (x_2)_{i_2} ⋯ (x_d)_{i_d}

is a multilinear form of (x_1, x_2, …, x_d), and ‖·‖_p denotes the L_p-norm of a vector for 1 ≤ p ≤ ∞. When the order of the tensor T is two, the problem reduces to the spectral p-norm of a matrix, and in particular, when p = 2, to the spectral norm, i.e., the largest singular value, of a matrix. The spectral p-norm of a tensor was proposed by Lim [18] in terms of singular values of a tensor, and is closely related to the largest Z-eigenvalue (for the case p = 2) of a tensor proposed by Qi [24].

(Research of the first author was supported in part by the National Natural Science Foundation of China, Grants 61772442 and 11671335.)
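For the matrix case (d = 2, p = 2), the constrained multilinear-form characterization above can be checked numerically against the largest singular value. The following sketch (our own illustration; the variable names are not from the paper) verifies that the maximum of x^T A y over unit 2-spheres is attained at the leading singular vectors:

```python
import numpy as np

# Spectral 2-norm of a matrix as a constrained multilinear-form maximum:
# ||A||_2 = max { x^T A y : ||x||_2 = ||y||_2 = 1 },
# attained at the leading left/right singular vectors.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))

U, s, Vt = np.linalg.svd(A)
x, y = U[:, 0], Vt[0, :]          # leading singular vectors, unit 2-norm
form_value = x @ A @ y            # multilinear form at the maximizer

assert np.isclose(form_value, s[0])   # attains the largest singular value
# Any other feasible pair gives a value no larger:
for _ in range(100):
    u = rng.standard_normal(4); u /= np.linalg.norm(u)
    v = rng.standard_normal(6); v /= np.linalg.norm(v)
    assert u @ A @ v <= s[0] + 1e-12
```

For d ≥ 3 no analogue of the SVD yields the maximizer, which is one way to see why the tensor case is computationally harder.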
The matrix spectral p-norm is evidently important in many branches of mathematics as well as in various practical applications; see, e.g., [6,11]. The complexity and approximation methods of the matrix spectral p-norm were studied extensively [1,21,27], with particular applications in robust optimization [27]. When p = 1, 2, the matrix spectral p-norm can be computed easily; when 2 < p ≤ ∞, computing it is NP-hard; and its complexity remains unknown for the remaining values of p. The tensor spectral p-norm has been studied mainly in approximation algorithms for polynomial optimization [15]. When the order of a tensor is larger than two, computing the tensor spectral norm (p = 2) is already NP-hard, as proved by He et al. [8] (see also [10]), a sharp contrast to the case of matrices. NP-hardness of computing the tensor spectral p-norm when 2 < p ≤ ∞ was also established by Hou and So [12]. Various approximation bounds of the tensor spectral p-norm were established in the literature [7–9,12,26]. Nikiforov [23] studied the tensor spectral p-norm using combinatorial methods and proposed several bounds. Li and Zhao [17] recently studied a more general tensor spectral p-norm and provided upper bounds via norm compression tensors.
The dual norm to the spectral p-norm of a tensor T, called the nuclear p-norm, is defined as ‖T‖_{p∗} := max { ⟨T, X⟩ : ‖X‖_{pσ} ≤ 1 }. In the case of matrices with p = 2, it reduces to the nuclear norm of a matrix, which equals the sum of all the singular values of the matrix. The matrix nuclear norm has been widely used as a convex envelope of the matrix rank in many rank minimization problems, such as matrix completion [2]. Friedland and Lim [4] studied the tensor nuclear p-norm systematically and showed that computing the tensor nuclear norm (p = 2) is NP-hard when the order of the tensor is larger than two. They also proposed simple lower and upper bounds of the tensor spectral norm and nuclear norm. The study of the tensor nuclear p-norm has mainly focused on the case p = 2, for instance in tensor completion [5,20,30]. Derksen [3] discussed the nuclear norm of various tensors based on orthogonality. Nie [22] studied symmetric tensor nuclear norms. Extremal properties of the tensor spectral norm and nuclear norm were studied in [16].
Most of the methods tackling the tensor spectral p-norm and nuclear p-norm in the literature rely heavily on matrix unfoldings, both in theory, such as approximation methods [15], and in practice, such as tensor completion [5]. Hu [13] established the relation of the tensor nuclear norm to the nuclear norms of its matrix unfoldings. Wang et al. [29] systematically studied the tensor spectral p-norm via various matrix unfoldings and tensor unfoldings. Li [14] proposed a novel approach to study the tensor spectral norm and nuclear norm via tensor partitions, a concept generalizing the block tensors of Ragnarsson and Van Loan [25]. Some neat bounds of the tensor spectral norm (respectively, nuclear norm) via the spectral norms (respectively, nuclear norms) of subtensors in any regular partition were proposed, together with a conjecture [14, Conjecture 3.5] on the corresponding bounds in any tensor partition.
In this paper, we systematically study the tensor spectral p-norm and nuclear p-norm via the partition approach in [14]. We prove that for the most general partition, called an arbitrary partition, the bounds of the tensor spectral p-norm and nuclear p-norm via subtensors can be established for any 1 ≤ p ≤ ∞. This generalizes and affirmatively answers Li's conjecture, which is the case p = 2 for a tensor partition. The novelty of the proof lies in establishing an index system to describe subtensors in an arbitrary partition. Based on this, we study the relations among the spectral p-norm of a tensor, the spectral p-norms of its matrix unfoldings, and the bounds via the spectral p-norms of its matrix slices. The same relations are studied for the tensor nuclear p-norm. Various bounds of these tensor norms in the literature can be derived from our results. This paper is organized as follows. We start with the preparation of notations, definitions, and properties of tensor norms and tensor partitions in Sect. 2. In Sect. 3, we present our main result on bounding the tensor spectral p-norm and nuclear p-norm via partitioned subtensors. Section 4 is devoted to discussions and theoretical applications, particularly the relations among the tensor norms, the norms of matrix unfoldings, and the norms via matrix slices.

Preparation
Throughout this paper, we uniformly use lowercase letters (e.g., x), boldface lowercase letters (e.g., x = (x_i)), capital letters (e.g., X = (x_{ij})), and calligraphic letters (e.g., 𝒳 = (x_{i_1 i_2 … i_d})) to denote scalars, vectors, matrices, and higher-order (order three or more) tensors, respectively. Denote by ℝ^{n_1 × n_2 × ⋯ × n_d} the space of dth-order real tensors of dimension n_1 × n_2 × ⋯ × n_d. The same notation applies to a vector space and a matrix space when d = 1 and d = 2, respectively. Denote by ℕ the set of positive integers.
Given a dth-order tensor space ℝ^{n_1 × n_2 × ⋯ × n_d}, we denote by 𝕀_k := {1, 2, …, n_k} the index set of mode-k for k = 1, 2, …, d. Trivially, 𝕀_1 × 𝕀_2 × ⋯ × 𝕀_d is the index set of the entries of a tensor in this tensor space. The Frobenius inner product of two tensors 𝒰, 𝒱 ∈ ℝ^{n_1 × n_2 × ⋯ × n_d} is defined as

⟨𝒰, 𝒱⟩ := Σ_{i_1=1}^{n_1} Σ_{i_2=1}^{n_2} ⋯ Σ_{i_d=1}^{n_d} u_{i_1 i_2 … i_d} v_{i_1 i_2 … i_d}.

Its induced Frobenius norm is naturally defined as ‖T‖_2 := √⟨T, T⟩. When d = 1, the Frobenius norm reduces to the Euclidean norm of a vector. In a similar vein, viewing a tensor as a vector, we may define the L_p-norm of a tensor (also known as the Hölder p-norm) for 1 ≤ p ≤ ∞ as

‖T‖_p := ( Σ_{i_1=1}^{n_1} Σ_{i_2=1}^{n_2} ⋯ Σ_{i_d=1}^{n_d} |t_{i_1 i_2 … i_d}|^p )^{1/p}.

A rank-one tensor, also called a simple tensor, is a tensor that can be written as the outer product of vectors, i.e., T = x_1 ⊗ x_2 ⊗ ⋯ ⊗ x_d. It can be equivalently represented by its entries as

t_{i_1 i_2 … i_d} = (x_1)_{i_1} (x_2)_{i_2} ⋯ (x_d)_{i_d}.   (2)

Here is a property of the L_p-norm of a rank-one tensor.
Proposition 2.1 For any vectors x_k ∈ ℝ^{n_k} with k = 1, 2, …, d and any 1 ≤ p ≤ ∞, ‖x_1 ⊗ x_2 ⊗ ⋯ ⊗ x_d‖_p = Π_{k=1}^d ‖x_k‖_p.

Proof According to (2), we have ‖x_1 ⊗ x_2 ⊗ ⋯ ⊗ x_d‖_p^p = Σ_{i_1, i_2, …, i_d} Π_{k=1}^d |(x_k)_{i_k}|^p = Π_{k=1}^d Σ_{i_k=1}^{n_k} |(x_k)_{i_k}|^p = Π_{k=1}^d ‖x_k‖_p^p, with the usual modification for p = ∞. ◻
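The multiplicativity of the Hölder L_p-norm over outer products can be checked numerically. A minimal sketch (our own; `lp_norm` is a helper we define, not notation from the paper):

```python
import numpy as np

# Check that the Holder L_p-norm of the rank-one tensor x1 ⊗ x2 ⊗ x3
# equals the product of the vector L_p-norms of x1, x2, x3.
def lp_norm(a, p):
    return float(np.sum(np.abs(a) ** p) ** (1.0 / p))

rng = np.random.default_rng(1)
x1, x2, x3 = rng.standard_normal(2), rng.standard_normal(3), rng.standard_normal(4)
T = np.einsum('i,j,k->ijk', x1, x2, x3)   # rank-one (simple) third-order tensor

for p in (1.0, 1.5, 2.0, 3.0):
    assert np.isclose(lp_norm(T, p), lp_norm(x1, p) * lp_norm(x2, p) * lp_norm(x3, p))
```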

The spectral p-norm and nuclear p-norm
Let us formally define the tensor spectral p-norm and its dual norm.

Definition 2.2
For a given tensor T ∈ ℝ^{n_1 × n_2 × ⋯ × n_d} and 1 ≤ p ≤ ∞, the spectral p-norm of T, denoted by ‖T‖_{pσ}, is defined as

‖T‖_{pσ} := max { ⟨T, x_1 ⊗ x_2 ⊗ ⋯ ⊗ x_d⟩ : ‖x_k‖_p = 1, k = 1, 2, …, d }.   (3)

Essentially, ‖T‖_{pσ} is the maximal value of the Frobenius inner product between T and a rank-one tensor whose L_p-norm is one, according to Proposition 2.1. We remark that ⟨T, x_1 ⊗ x_2 ⊗ ⋯ ⊗ x_d⟩ = T(x_1, x_2, …, x_d), the multilinear form in (1). Hence, as mentioned in Sect. 1, the tensor spectral p-norm is more commonly known via the L_p-sphere constrained multilinear form optimization problem in the optimization community. When p = 2, the tensor spectral p-norm is often called the tensor spectral norm, and is also known to be the largest singular value of the tensor [18].

Definition 2.3 For a given tensor T ∈ ℝ^{n_1 × n_2 × ⋯ × n_d} and 1 ≤ p ≤ ∞, the nuclear p-norm of T, denoted by ‖T‖_{p∗}, is defined as

‖T‖_{p∗} := min { Σ_{r=1}^m ‖x_1^r ⊗ x_2^r ⊗ ⋯ ⊗ x_d^r‖_p : T = Σ_{r=1}^m x_1^r ⊗ x_2^r ⊗ ⋯ ⊗ x_d^r, m ∈ ℕ }.   (4)

The decomposition of T into a sum of rank-one tensors, such as that in (4), is called a rank-one decomposition of T. Therefore, the tensor nuclear p-norm is the minimum of the sum of the L_p-norms of rank-one tensors over all rank-one decompositions. A rank-one decomposition of T that attains ‖T‖_{p∗} is called a nuclear p-decomposition of T, similar to the nuclear decomposition of a tensor for p = 2 discussed in [4]. When p = 2, the tensor nuclear p-norm is commonly known as the tensor nuclear norm. The tensor nuclear norm is the convex envelope of the tensor rank and is widely used in tensor completion [30].
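For matrices with p = 2, the minimum in the definition of the nuclear p-norm is attained by the singular value decomposition, and the nuclear norm equals the sum of the singular values. A sketch of this well-known special case (our own illustration):

```python
import numpy as np

# For d = 2 and p = 2, the SVD provides a nuclear decomposition:
# A = sum_r (s_r u_r) ⊗ v_r, and the decomposition cost equals sum(s_r).
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 5))

U, s, Vt = np.linalg.svd(A)
rank_one_terms = [s[r] * np.outer(U[:, r], Vt[r, :]) for r in range(len(s))]
assert np.allclose(sum(rank_one_terms), A)     # a valid rank-one decomposition

# Each term s_r u_r v_r^T has Frobenius (L_2) norm s_r, so the total cost is:
cost = sum(np.linalg.norm(t) for t in rank_one_terms)
assert np.isclose(cost, s.sum())
assert np.isclose(np.linalg.norm(A, 'nuc'), s.sum())
```

For d ≥ 3 no decomposition is known to attain the minimum in polynomial time, consistent with the NP-hardness result of [4].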
We provide some basic facts about the tensor spectral p-norm and nuclear p-norm. The proof is essentially based on Hölder's inequality.

Proposition 2.4 For any vectors x_k ∈ ℝ^{n_k} with k = 1, 2, …, d and any 1 ≤ p, q ≤ ∞ with 1/p + 1/q = 1, ‖x_1 ⊗ x_2 ⊗ ⋯ ⊗ x_d‖_{pσ} = Π_{k=1}^d ‖x_k‖_q and ‖x_1 ⊗ x_2 ⊗ ⋯ ⊗ x_d‖_{p∗} = Π_{k=1}^d ‖x_k‖_p.
The tensor nuclear p-norm is the dual norm to the tensor spectral p-norm, and vice versa, for any 1 ≤ p ≤ ∞.

Lemma 2.5
For given tensors T and Z in the same tensor space and 1 ≤ p ≤ ∞, it follows that |⟨T, Z⟩| ≤ ‖T‖_{pσ} ‖Z‖_{p∗} and |⟨T, Z⟩| ≤ ‖T‖_{p∗} ‖Z‖_{pσ}, and further ‖T‖_{pσ} = max_{‖Z‖_{p∗} ≤ 1} ⟨T, Z⟩ and ‖T‖_{p∗} = max_{‖Z‖_{pσ} ≤ 1} ⟨T, Z⟩.

We remark that the proof of Lemma 2.5 for p = 2 can be found in [3,19]. When d = 2, the tensor spectral p-norm and nuclear p-norm reduce to the matrix spectral p-norm and nuclear p-norm, respectively. When d = 1, i.e., for a vector, its spectral p-norm is the L_q-norm, where 1/p + 1/q = 1, and its nuclear p-norm is the L_p-norm, as mentioned in Proposition 2.4. Two extreme cases of these norms are worth mentioning; they are the only known cases that are easy to compute.
On the other hand, denote by e_i the vector whose ith entry is one and whose other entries are zeros, and let t_{s_1 s_2 … s_d} be an entry of T with the largest absolute value, so that |t_{s_1 s_2 … s_d}| = ‖T‖_∞. Clearly ‖e_i‖_1 = 1, and we have ‖T‖_{1σ} ≥ |⟨T, e_{s_1} ⊗ e_{s_2} ⊗ ⋯ ⊗ e_{s_d}⟩| = |t_{s_1 s_2 … s_d}| = ‖T‖_∞. Therefore, ‖T‖_{1σ} = ‖T‖_∞, and the other identity, ‖T‖_{1∗} = ‖T‖_1, follows since the dual norm of the tensor L_1-norm is the tensor L_∞-norm.
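The p = 1 extreme case can be illustrated numerically for a matrix: the spectral 1-norm is the largest entry in absolute value, attained at a pair of coordinate vectors. A sketch (our own illustration):

```python
import numpy as np

# Spectral 1-norm of a matrix: max { x^T A y : ||x||_1 = ||y||_1 = 1 }
# equals the largest entry in absolute value, attained at basis vectors.
rng = np.random.default_rng(3)
A = rng.standard_normal((3, 5))

# Restricting to basis-vector pairs already attains the maximum:
best = max(abs(A[i, j]) for i in range(3) for j in range(5))
assert np.isclose(best, np.max(np.abs(A)))

# No feasible pair does better (the L_1 ball is the convex hull of +-e_i):
for _ in range(100):
    x = rng.standard_normal(3); x /= np.abs(x).sum()
    y = rng.standard_normal(5); y /= np.abs(y).sum()
    assert abs(x @ A @ y) <= np.max(np.abs(A)) + 1e-12
```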


Tensor partitions
A matrix can be partitioned into submatrices, and the same can be done for a tensor. One important class of tensor partitions, block tensors, was proposed and studied in [25,28]; it is a straightforward generalization of block matrices. Li [14] proposed three types of partitions for tensors, namely, modal partitions (an alternative name for block tensors), regular partitions, and tensor partitions, each generalizing the previous one. Some neat bounds on the tensor spectral norm and nuclear norm based on regular partitions were proposed in [14]; the proofs relied heavily on the recursive structure in the definition of regular partitions. Since we extend the results to a class of partitions more general than tensor partitions, we only discuss the definition of tensor partitions and refer to [14] for modal partitions and regular partitions. Before presenting the partition concepts, we first discuss notations to describe subtensors of a tensor; this is also an essential step in proving our main bounds in Sect. 3. Suppose that T_j is a subtensor of a tensor T ∈ ℝ^{n_1 × n_2 × ⋯ × n_d}. We denote by 𝕀_k^j ⊆ 𝕀_k the set of its mode-k indices in the original tensor T for k = 1, 2, …, d. We then let T_j = T(𝕀_1^j, 𝕀_2^j, …, 𝕀_d^j). Specifically, T_j is the subtensor of T obtained by keeping only the indices in 𝕀_k^j of mode-k for k = 1, 2, …, d; alternatively, T_j is the subtensor obtained by deleting all the indices in 𝕀_k \ 𝕀_k^j of mode-k for k = 1, 2, …, d from the original tensor T. The dimension of the subtensor T_j is |𝕀_1^j| × |𝕀_2^j| × ⋯ × |𝕀_d^j|. In our analysis, we do not relabel the indices of any mode of T_j, say 𝕀_k^j, to {1, 2, …, |𝕀_k^j|}, but keep its original indices in T. We remark that in a tensor partition, every subtensor T_j must be a whole block (not disconnected) of the original tensor T. The following observation is straightforward from Definition 2.7.

Proposition 2.8
If T_1, T_2, …, T_m is a tensor partition of a tensor T, where T_j = T(𝕀_1^j, 𝕀_2^j, …, 𝕀_d^j) for j = 1, 2, …, m, then the index sets 𝕀_1^j × 𝕀_2^j × ⋯ × 𝕀_d^j for j = 1, 2, …, m are pairwise disjoint and their union is 𝕀_1 × 𝕀_2 × ⋯ × 𝕀_d; that is, every entry of T belongs to exactly one subtensor.

In a similar way, we denote by x(𝕀_k^j) ∈ ℝ^{|𝕀_k^j|} the vector obtained by keeping only the entries of x with indices in 𝕀_k^j, or equivalently by deleting the entries of x whose indices are not in 𝕀_k^j. Again, in our analysis, we do not relabel these indices to {1, 2, …, |𝕀_k^j|}. We remark that Proposition 2.8 in fact suggests a more general partition concept than the tensor partition in Definition 2.7. We may further drop the requirement that the indices in 𝕀_k^j be consecutive for T_j. In this case, T_j may consist of several disconnected pieces when viewed from the original tensor T, but these pieces can be put together to form a tensor by deleting empty entries from T (see Example 2.10). Although one can relabel some mode-k indices (similar to swapping rows or columns of a matrix) to make one of the T_j's a tensor with consecutive indices in every mode, this may break other T_j's into disconnected pieces. Hence, one can define a more general partition concept that allows disconnections.

Definition 2.9
A collection of subtensors T_1, T_2, …, T_m, where T_j = T(𝕀_1^j, 𝕀_2^j, …, 𝕀_d^j) and 𝕀_k^j ⊆ 𝕀_k for k = 1, 2, …, d and j = 1, 2, …, m, is called an arbitrary partition of a tensor T ∈ ℝ^{n_1 × n_2 × ⋯ × n_d} if the index sets 𝕀_1^j × 𝕀_2^j × ⋯ × 𝕀_d^j for j = 1, 2, …, m are pairwise disjoint and ∪_{j=1}^m (𝕀_1^j × 𝕀_2^j × ⋯ × 𝕀_d^j) = 𝕀_1 × 𝕀_2 × ⋯ × 𝕀_d.

Arbitrary partitions are the most general way of partitioning a tensor. The following example indicates the key difference between a tensor partition and an arbitrary partition for a matrix; obviously, arbitrary partitions can be far more complicated than tensor partitions for higher-order tensors. In particular, no tensor partition of a 4 × 6 matrix can consist of exactly three 2 × 2 matrices and three 1 × 4 matrices. However, an arbitrary partition can achieve this, such as the partition in the right subfigure of Fig. 1.
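One such arbitrary partition of a 4 × 6 matrix can be written down explicitly and verified. The arrangement below is our own (not necessarily the one in Fig. 1); the key feature is that some row index sets are non-consecutive, which is exactly what a tensor partition forbids:

```python
from itertools import product

# An arbitrary partition of a 4 x 6 matrix into three 2 x 2 and three 1 x 4
# blocks. Each block is a Cartesian product of index sets; the row sets
# {0, 3}, {1, 3}, {2, 3} are non-consecutive, so this is not a tensor partition.
blocks = [
    ({0, 3}, {0, 1}), ({1, 3}, {2, 3}), ({2, 3}, {4, 5}),            # 2 x 2
    ({0}, {2, 3, 4, 5}), ({1}, {0, 1, 4, 5}), ({2}, {0, 1, 2, 3}),   # 1 x 4
]

covered = [set(product(rows, cols)) for rows, cols in blocks]
all_entries = set(product(range(4), range(6)))

# Every entry lies in exactly one block: the defining property of a partition.
assert set().union(*covered) == all_entries
assert sum(len(c) for c in covered) == len(all_entries)
```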
Finally in this section, we remark that some T_j (either connected or disconnected) in an arbitrary partition of a tensor may not have the same order as the original tensor T. If some 𝕀_k^j contains only one index, mode-k disappears and the order of T_j is reduced by one. However, we still treat this T_j as a dth-order tensor by keeping the dimension of mode-k equal to one, just as one may always treat a scalar as a one-dimensional vector or a one-by-one matrix.

Bounds of the tensor norms
With the establishment of the index system to describe subtensors in an arbitrary partition, we are now in a better position to present and prove the main results in this paper, bounding the spectral p-norm and the nuclear p-norm of a tensor via the spectral p-norms and the nuclear p-norms of subtensors in an arbitrary partition.

Theorem 3.1 If T_1, T_2, …, T_m is an arbitrary partition of a tensor T and 1 ≤ p, q ≤ ∞ with 1/p + 1/q = 1, then

max_{1≤j≤m} ‖T_j‖_{pσ} ≤ ‖T‖_{pσ} ≤ ( Σ_{j=1}^m ‖T_j‖_{pσ}^q )^{1/q},   (6)

( Σ_{j=1}^m ‖T_j‖_{p∗}^p )^{1/p} ≤ ‖T‖_{p∗} ≤ Σ_{j=1}^m ‖T_j‖_{p∗}.   (7)
Proof For an arbitrary partition T_1, T_2, …, T_m of T, let T_j = T(𝕀_1^j, 𝕀_2^j, …, 𝕀_d^j), where 𝕀_k^j ⊆ 𝕀_k for k = 1, 2, …, d and j = 1, 2, …, m. The whole proof is divided into four steps, each one showing one bound in (6) and (7).

(1) The lower bound of ‖T‖_{pσ} in (6). For any given T_j, let y_k ∈ ℝ^{|𝕀_k^j|} with ‖y_k‖_p = 1 for k = 1, 2, …, d be an optimal solution of

max { ⟨T_j, y_1 ⊗ y_2 ⊗ ⋯ ⊗ y_d⟩ : ‖y_k‖_p = 1, k = 1, 2, …, d }.

Instead of being {1, 2, …, |𝕀_k^j|}, the indices of y_k are kept the same as those of 𝕀_k^j for k = 1, 2, …, d. For every k, we define x_k ∈ ℝ^{n_k} by letting x_k(𝕀_k^j) = y_k and setting all other entries of x_k to zero. Clearly ‖x_k‖_p = ‖y_k‖_p = 1. Therefore,

‖T‖_{pσ} ≥ ⟨T, x_1 ⊗ x_2 ⊗ ⋯ ⊗ x_d⟩ = ⟨T_j, y_1 ⊗ y_2 ⊗ ⋯ ⊗ y_d⟩ = ‖T_j‖_{pσ},

proving that max_{1≤j≤m} ‖T_j‖_{pσ} ≤ ‖T‖_{pσ}.

(2) The upper bound of ‖T‖_{pσ} in (6). Let x_k ∈ ℝ^{n_k} with ‖x_k‖_p = 1 for k = 1, 2, …, d be an optimal solution of (3), i.e., ‖T‖_{pσ} = ⟨T, x_1 ⊗ x_2 ⊗ ⋯ ⊗ x_d⟩. First, we observe that

⟨T_j, x_1(𝕀_1^j) ⊗ x_2(𝕀_2^j) ⊗ ⋯ ⊗ x_d(𝕀_d^j)⟩ ≤ ‖T_j‖_{pσ} Π_{k=1}^d ‖x_k(𝕀_k^j)‖_p.   (8)

It is obvious that (8) holds trivially if one of the x_k(𝕀_k^j)'s is a zero vector. Otherwise, normalizing each x_k(𝕀_k^j) to x_k(𝕀_k^j)/‖x_k(𝕀_k^j)‖_p and applying (3) to T_j, we get that (8) holds in general. Since T_1, T_2, …, T_m is an arbitrary partition,

‖T‖_{pσ} = Σ_{j=1}^m ⟨T_j, x_1(𝕀_1^j) ⊗ ⋯ ⊗ x_d(𝕀_d^j)⟩ ≤ Σ_{j=1}^m ‖T_j‖_{pσ} Π_{k=1}^d ‖x_k(𝕀_k^j)‖_p ≤ ( Σ_{j=1}^m ‖T_j‖_{pσ}^q )^{1/q} ( Σ_{j=1}^m Π_{k=1}^d ‖x_k(𝕀_k^j)‖_p^p )^{1/p} = ( Σ_{j=1}^m ‖T_j‖_{pσ}^q )^{1/q},

where the first inequality is due to (8), the second inequality follows from Hölder's inequality, and the last equality holds due to Proposition 2.1 and Σ_{j=1}^m Π_{k=1}^d ‖x_k(𝕀_k^j)‖_p^p = Π_{k=1}^d ‖x_k‖_p^p = 1.

(3) The lower bound of ‖T‖_{p∗} in (7). For any X ∈ ℝ^{n_1 × n_2 × ⋯ × n_d}, let X_j = X(𝕀_1^j, 𝕀_2^j, …, 𝕀_d^j) for j = 1, 2, …, m, i.e., X_1, X_2, …, X_m is an arbitrary partition of X. By the upper bound of (6) proved in step (2), we have ‖X‖_{pσ} ≤ ( Σ_{j=1}^m ‖X_j‖_{pσ}^q )^{1/q}. Therefore, according to the dual property in Lemma 2.5, we have

‖T‖_{p∗} = max_{‖X‖_{pσ} ≤ 1} ⟨T, X⟩ ≥ max { ⟨T, X⟩ : ( Σ_{j=1}^m ‖X_j‖_{pσ}^q )^{1/q} ≤ 1 }.   (9)

For j = 1, 2, …, m, let y_j = ‖X_j‖_{pσ} ≥ 0 and further let Z_j = X_j / y_j if y_j > 0 or Z_j = O if y_j = 0. Clearly ‖Z_j‖_{pσ} ≤ 1 and we have ⟨T, X⟩ = Σ_{j=1}^m ⟨T_j, X_j⟩ = Σ_{j=1}^m y_j ⟨T_j, Z_j⟩. Therefore, (9) further leads to

‖T‖_{p∗} ≥ max { Σ_{j=1}^m y_j ⟨T_j, Z_j⟩ : ‖(y_1, …, y_m)‖_q ≤ 1, ‖Z_j‖_{pσ} ≤ 1 } = max_{‖y‖_q ≤ 1} Σ_{j=1}^m y_j max_{‖Z_j‖_{pσ} ≤ 1} ⟨T_j, Z_j⟩ = max_{‖y‖_q ≤ 1} Σ_{j=1}^m y_j ‖T_j‖_{p∗} = ( Σ_{j=1}^m ‖T_j‖_{p∗}^p )^{1/p},

where the first equality is due to the nonnegativity of y_j and of max_{‖Z_j‖_{pσ} ≤ 1} ⟨T_j, Z_j⟩ for any 1 ≤ j ≤ m, the second equality is due to the dual norm property, and the last equality is due to the tightness of Hölder's inequality.

(4) The upper bound of ‖T‖_{p∗} in (7). For every j = 1, 2, …, m, let T'_j ∈ ℝ^{n_1 × n_2 × ⋯ × n_d} be the tensor that agrees with T on the entries indexed by 𝕀_1^j × 𝕀_2^j × ⋯ × 𝕀_d^j and is zero elsewhere, so that T = Σ_{j=1}^m T'_j. By applying an approach similar to that in step (1), it is not difficult to get ‖T'_j‖_{p∗} ≤ ‖T_j‖_{p∗}. Therefore, by the triangle inequality, we have

‖T‖_{p∗} = ‖Σ_{j=1}^m T'_j‖_{p∗} ≤ Σ_{j=1}^m ‖T'_j‖_{p∗} ≤ Σ_{j=1}^m ‖T_j‖_{p∗},

proving the last bound. ◻

Theorem 3.1 generalizes and affirmatively answers the conjecture in [14], which is for p = 2 and a tensor partition (a special case of an arbitrary partition):

If T_1, T_2, …, T_m is a tensor partition of a tensor T, then max_{1≤j≤m} ‖T_j‖_{2σ} ≤ ‖T‖_{2σ} ≤ ( Σ_{j=1}^m ‖T_j‖_{2σ}^2 )^{1/2} and ( Σ_{j=1}^m ‖T_j‖_{2∗}^2 )^{1/2} ≤ ‖T‖_{2∗} ≤ Σ_{j=1}^m ‖T_j‖_{2∗}.
Theorem 3.1 also provides an alternative proof of a more special case, for p = 2 and a regular partition (a special case of a tensor partition), in [14, Theorem 3.1], whose proof is based on mathematical induction and relies heavily on the recursive structure in the definition of a regular partition. The novelty of the proof of Theorem 3.1 lies in establishing an index system to describe arbitrary partitions. It also provides a clearer picture relating the subtensors to the original tensor.
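The bounds of Theorem 3.1 can be sanity-checked numerically in the matrix case d = 2 with p = q = 2, where both norms are computable via the SVD. A sketch (our own; it uses a modal block partition, a special case of an arbitrary partition):

```python
import numpy as np

# Numerical check of the Theorem 3.1 bounds for d = 2, p = q = 2:
#   max_j ||T_j||_2  <=  ||T||_2  <=  sqrt(sum_j ||T_j||_2^2),
#   sqrt(sum_j ||T_j||_*^2)  <=  ||T||_*  <=  sum_j ||T_j||_*.
rng = np.random.default_rng(4)
T = rng.standard_normal((4, 6))
blocks = [T[:2, :3], T[:2, 3:], T[2:, :3], T[2:, 3:]]   # 2 x 2 block partition

spec = [np.linalg.norm(B, 2) for B in blocks]           # spectral norms
nuc = [np.linalg.norm(B, 'nuc') for B in blocks]        # nuclear norms

assert max(spec) <= np.linalg.norm(T, 2) <= np.sqrt(sum(s**2 for s in spec))
assert np.sqrt(sum(v**2 for v in nuc)) <= np.linalg.norm(T, 'nuc') <= sum(nuc)
```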

Discussions and theoretical applications
The general bounds on the tensor spectral p-norm and nuclear p-norm in Theorem 3.1 provide more insight into dealing with particular tensor instances in practice. Unlike the traditional matrix unfolding technique, in which one needs to unfold a tensor in a fixed way, the flexibility of arbitrary partitions provides more tools to estimate tensor norms of given tensor data in applications.
In particular, it is useful for tensors comprised of pieces with known spectral or nuclear p-norms. Let us look into its theoretical applications and see how these bounds connect to other tensor norm bounds in the literature. We first check the tightness of the bounds in Theorem 3.1. Given the flexibility of arbitrary partitions, it is impossible to provide a general necessary and sufficient condition for these bounds to be tight. A trivial sufficient condition for all the bounds in Theorem 3.1 to be tight is that all but one of the T_j's are zero tensors. The other obvious case is p = 1 and q = ∞, under which Theorem 3.1 reduces to

‖T‖_{1σ} = max_{1≤j≤m} ‖T_j‖_{1σ} and ‖T‖_{1∗} = Σ_{j=1}^m ‖T_j‖_{1∗}.

These identities can also be verified by Proposition 2.6, where ‖T‖_{1σ} = ‖T‖_∞ and ‖T‖_{1∗} = ‖T‖_1.
One interesting case is for rank-one tensors, which was already observed in [14] for p = 2 and a regular partition.
Proposition 4.1 If T is a rank-one tensor, T_1, T_2, …, T_m is an arbitrary partition of T, and 1 ≤ p, q ≤ ∞ with 1/p + 1/q = 1, then

‖T‖_{pσ} = ( Σ_{j=1}^m ‖T_j‖_{pσ}^q )^{1/q} and ‖T‖_{p∗} = ( Σ_{j=1}^m ‖T_j‖_{p∗}^p )^{1/p}.   (10)

Proof Observe that the collection of all entries of T_j, taken as 1 × 1 × ⋯ × 1 tensors, is an arbitrary partition of T_j for every j. Noticing that any scalar x ∈ ℝ has ‖x‖_{pσ} = ‖x‖_{p∗} = |x|, by applying the upper bound of (6) for T and every T_j (1 ≤ j ≤ m), one has

‖T‖_{pσ} ≤ ( Σ_{j=1}^m ‖T_j‖_{pσ}^q )^{1/q} ≤ ‖T‖_q,   (11)

and by applying the lower bound of (7) one also has

‖T‖_{p∗} ≥ ( Σ_{j=1}^m ‖T_j‖_{p∗}^p )^{1/p} ≥ ‖T‖_p.   (12)

On the other hand, as T is rank-one, one has ‖T‖_{pσ} = ‖T‖_q and ‖T‖_{p∗} = ‖T‖_p according to Proposition 2.4. By combining this with (11) and (12), we are led to the final identity (10). ◻

As we see from the above discussion, both the upper and lower bounds in Theorem 3.1 can be attained in various cases. In general, the more subtensors in an arbitrary partition, the larger the gap between the lower and upper bounds for a generic tensor. In particular, if a partition has m subtensors, the largest possible gap between the lower and upper bounds is m^{1/q}, attained when all subtensors have the same spectral p-norm or nuclear p-norm. In an extreme though trivial case where there is only one subtensor in the partition (the original tensor itself), all the bounds become trivially tight. However, due to the curse of dimensionality and the NP-hardness of computing these norms, the larger the subtensors, the more difficult and inaccurate the estimation of these norms.
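The rank-one tightness can be observed numerically for a matrix with p = q = 2: under any block partition of a rank-one matrix, the upper bound of (6) and the lower bound of (7) hold with equality. A sketch (our own illustration):

```python
import numpy as np

# For a rank-one matrix T = x ⊗ y and p = q = 2, the spectral upper bound
# and the nuclear lower bound of Theorem 3.1 are tight (cf. Proposition 4.1).
x = np.array([1.0, -2.0, 3.0, 0.5])
y = np.array([2.0, 1.0, -1.0])
T = np.outer(x, y)                                   # rank-one matrix

blocks = [T[:2, :2], T[:2, 2:], T[2:, :2], T[2:, 2:]]
spec = [np.linalg.norm(B, 2) for B in blocks]
nuc = [np.linalg.norm(B, 'nuc') for B in blocks]

# Every block is itself rank-one, and both bounds collapse to equalities:
assert np.isclose(np.linalg.norm(T, 2), np.sqrt(sum(s**2 for s in spec)))
assert np.isclose(np.linalg.norm(T, 'nuc'), np.sqrt(sum(v**2 for v in nuc)))
```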
We now discuss the main bounds in some special cases to relate them to existing bounds in the literature. By applying the finest partition { T_j = (t_{i_1 i_2 … i_d}) ∈ ℝ^{1×1×⋯×1} : (i_1, i_2, …, i_d) ∈ 𝕀_1 × 𝕀_2 × ⋯ × 𝕀_d } in Theorem 3.1, we obtain the following bounds among tensor norms.

Proposition 4.2 For any tensor T and 1 ≤ p, q ≤ ∞ with 1/p + 1/q = 1,

‖T‖_∞ ≤ ‖T‖_{pσ} ≤ ‖T‖_q and ‖T‖_p ≤ ‖T‖_{p∗} ≤ ‖T‖_1.   (13)

The second inequality of (13), ‖T‖_{pσ} ≤ ‖T‖_q, is exactly the one in [23, Theorem 20], and hence this provides an alternative proof of that upper bound of the tensor spectral p-norm. When p = 2, (13) also implies the bounds proposed in [4, Lemma 9.1], which relate ‖T‖_{2σ}, ‖T‖_2, and ‖T‖_{2∗} up to a factor depending on Π_{k=1}^d n_k.
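For a matrix with p = q = 2, the chain of inequalities in (13) can be checked directly: the largest absolute entry, the spectral norm, the Frobenius (L_2) norm, the nuclear norm, and the entrywise L_1 norm are nondecreasing in that order. A sketch (our own illustration):

```python
import numpy as np

# Finest-partition bounds (13) for p = q = 2 and a matrix:
#   max|t_ij| <= ||T||_2 <= ||T||_F <= ||T||_* <= sum|t_ij|.
rng = np.random.default_rng(5)
T = rng.standard_normal((5, 7))

chain = [
    np.max(np.abs(T)),            # L_inf norm
    np.linalg.norm(T, 2),         # spectral norm
    np.linalg.norm(T),            # Frobenius (L_2) norm
    np.linalg.norm(T, 'nuc'),     # nuclear norm
    np.sum(np.abs(T)),            # L_1 norm
]
assert all(a <= b + 1e-12 for a, b in zip(chain, chain[1:]))
```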

Next, we apply Theorem 3.1 with partitions into vector fibers of T, say mode-d fibers, i.e., T_{i_1 i_2 … i_{d−1}} = (t_{i_1 i_2 … i_{d−1} i_d})_{1≤i_d≤n_d} ∈ ℝ^{1×⋯×1×n_d} for (i_1, i_2, …, i_{d−1}) ∈ 𝕀_1 × 𝕀_2 × ⋯ × 𝕀_{d−1}. The resulting bounds tighten those of (13) to the following.

Proposition 4.3 For any tensor T and 1 ≤ p, q ≤ ∞ with 1/p + 1/q = 1,

max_{i_1,…,i_{d−1}} ‖T_{i_1 … i_{d−1}}‖_q ≤ ‖T‖_{pσ} ≤ ‖T‖_q and ‖T‖_p ≤ ‖T‖_{p∗} ≤ Σ_{i_1,…,i_{d−1}} ‖T_{i_1 … i_{d−1}}‖_p.   (14)

The first inequality of (14) is exactly the one in [23, Proposition 22]. When p = 2 and n_d = max_{1≤k≤d} n_k, the first inequality of (14) also implies the bound in [29, Corollary 4.9]:

‖T‖_{2σ} ≥ ‖T‖_2 / ( Π_{k=1}^{d−1} n_k )^{1/2}.

This is because the largest gap between the lowest and the highest bounds in (14) is ( Π_{k=1}^{d−1} n_k )^{1/2}. Let us now apply partitions into matrix slices and discuss their connections to matrix unfoldings. Matrix unfoldings of a tensor have been one of the main tools in studying tensor computation and optimization problems, mainly because most tensor problems are NP-hard [10] while the corresponding matrix problems are much easier. One important example is the tensor spectral norm and nuclear norm: both are NP-hard to compute when the order of the tensor is d ≥ 3, while they can be computed in polynomial time for a matrix (d = 2). In practice, the tensor nuclear norm is widely used in tensor completion [5,20] as a convex envelope of the tensor rank. In some literature, the tensor nuclear norm is even defined as the average of the nuclear norms of its matrix unfoldings, as this definition, although different from the original one, can be computed in polynomial time.
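The fiber bounds of Proposition 4.3 can be illustrated for a matrix with p = 2, where the mode-2 fibers are the rows: the largest row L_2-norm lower-bounds the spectral norm, which in turn yields the ‖T‖_2 / √n_1 bound in the spirit of [29, Corollary 4.9]. A sketch (our own illustration):

```python
import numpy as np

# Fiber bounds for a matrix with p = q = 2: the largest row L_2-norm
# lower-bounds the spectral norm, hence ||T||_2sigma >= ||T||_F / sqrt(n_1).
rng = np.random.default_rng(6)
n1, n2 = 4, 7
T = rng.standard_normal((n1, n2))

row_norms = np.linalg.norm(T, axis=1)
assert row_norms.max() <= np.linalg.norm(T, 2) + 1e-12
assert np.linalg.norm(T, 2) >= np.linalg.norm(T) / np.sqrt(n1) - 1e-12
```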
When p = 2, the relations between the spectral norm of a tensor and those of its matrix unfoldings have been studied widely, while those for the tensor nuclear norm were addressed only by Hu [13] and later by Friedland and Lim [4]. Wang et al. [29] comprehensively studied the spectral p-norm based on various matrix unfoldings as well as tensor unfoldings. One obvious way to apply Theorem 3.1 is to partition a tensor into matrix slices. For a clearer presentation, we mainly discuss third-order tensors; the discussion easily generalizes to higher orders. Let T ∈ ℝ^{n_1 × n_2 × n_3}. Denote by Mat_1(T) ∈ ℝ^{n_1 × n_2 n_3}, Mat_2(T) ∈ ℝ^{n_2 × n_1 n_3}, and Mat_3(T) ∈ ℝ^{n_3 × n_1 n_2} the mode-1, mode-2, and mode-3 unfolding matrices of T, respectively. For k = 1, 2, 3, denote by T_i^k the ith mode-k matrix slice of T for i = 1, 2, …, n_k; see the following example.