Computation over t-Product Based Tensor Stiefel Manifold: A Preliminary Study

Mao, Xian-Peng; Wang, Ying; Yang, Yu-Ning

doi:10.1007/s40305-023-00522-z

Published: 09 January 2024

Journal of the Operations Research Society of China

Abstract

Let $ * $ denote the t-product between two third-order tensors proposed by Kilmer and Martin (Linear Algebra Appl 435(3): 641–658, 2011). The purpose of this work is to study fundamental computation over the set $ \textrm{St}\left( n,p,l\right) := \{\mathcal {X} \in \mathbb R^{n\times p \times l} \mid \mathcal {X} ^{\top } * \mathcal {X} = \mathcal I \}$, where $\mathcal {X} $ is a third-order tensor of size $n\times p \times l$ ($n\geqslant p$) and ${\mathcal {I}}$ is the identity tensor. It is first verified that $ \textrm{St}\left( n,p,l\right) $ endowed with the Euclidean metric forms a Riemannian manifold, which is termed as the (third-order) tensor Stiefel manifold in this work. We then derive the tangent space, Riemannian gradient, and Riemannian Hessian on $ \textrm{St}\left( n,p,l\right) $. In addition, formulas of various retractions based on t-QR, t-polar decomposition, t-Cayley transform, and t-exponential, as well as vector transports, are presented. It is expected that analogous to their matrix counterparts, the derived formulas may serve as building blocks for analyzing optimization problems over the tensor Stiefel manifold and designing Riemannian algorithms.

Data Availability

We do not analyze or generate any datasets, because our work proceeds within a theoretical and mathematical approach.

Notes

For the QR factorization of a complex matrix, the R factor can be chosen to be upper triangular with real nonzero diagonal elements.

References

[1] Kolda, T.G., Bader, B.W.: Tensor decompositions and applications. SIAM Rev. 51(3), 455–500 (2009)
[2] Comon, P.: Tensors: a brief introduction. IEEE Signal Process. Mag. 31(3), 44–53 (2014)
[3] Cichocki, A., Mandic, D., De Lathauwer, L., Zhou, G., Zhao, Q., Caiafa, C., Phan, H.A.: Tensor decompositions for signal processing applications: from two-way to multiway component analysis. IEEE Signal Process. Mag. 32(2), 145–163 (2015)
[4] Sidiropoulos, N., De Lathauwer, L., Fu, X., Huang, K., Papalexakis, E., Faloutsos, C.: Tensor decomposition for signal processing and machine learning. IEEE Trans. Signal Process. 65(13), 3551–3582 (2017)
[5] Braman, K.: Third-order tensors as linear operators on a space of matrices. Linear Algebra Appl. 433(7), 1241–1253 (2010)
[6] Kilmer, M.E., Martin, C.D.: Factorization strategies for third-order tensors. Linear Algebra Appl. 435(3), 641–658 (2011)
[7] Kilmer, M.E., Braman, K., Hao, N., Hoover, R.C.: Third-order tensors as operators on matrices: a theoretical and computational framework with applications in imaging. SIAM J. Matrix Anal. Appl. 34(1), 148–172 (2013)
[8] Lu, C., Feng, J., Chen, Y., Liu, W., Lin, Z., Yan, S.: Tensor robust principal component analysis with a new tensor nuclear norm. IEEE Trans. Pattern Anal. Mach. Intell. 42(4), 925–938 (2019)
[9] Miao, Y., Qi, L., Wei, Y.: T-Jordan canonical form and t-Drazin inverse based on the t-product. Commun. Appl. Math. Comput. Sci. 3(2), 201–220 (2021)
[10] Lund, K.: The tensor t-function: a definition for functions of third-order tensors. Numer. Linear Algebra Appl. 27(3), e2288 (2020)
[11] Miao, Y., Qi, L., Wei, Y.: Generalized tensor function via the tensor singular value decomposition based on the T-product. Linear Algebra Appl. 590, 258–303 (2020)
[12] Liu, W.H., Jin, X.Q.: A study on T-eigenvalues of third-order tensors. Linear Algebra Appl. 612, 357–374 (2020)
[13] Zheng, M.M., Huang, Z.H., Wang, Y.: T-positive semidefiniteness of third-order symmetric tensors and T-semidefinite programming. Comput. Optim. Appl. 78(1), 239–272 (2021)
[14] Qi, L., Luo, Z.: Tubal matrices (2021). arXiv:2105.00793
[15] Huang, W., Absil, P.A., Gallivan, K.A.: A Riemannian BFGS method without differentiated retraction for nonconvex optimization problems. SIAM J. Optim. 28(1), 470–495 (2018)
[16] Hu, J., Jiang, B., Lin, L., Wen, Z., Yuan, Y.X.: Structured quasi-Newton methods for optimization with orthogonality constraints. SIAM J. Sci. Comput. 41(4), A2239–A2269 (2019)
[17] Chen, S., Ma, S., So, A.M.C., Zhang, T.: Proximal gradient method for nonsmooth optimization over the Stiefel manifold. SIAM J. Optim. 30(1), 210–239 (2020)
[18] Huang, W., Wei, K.: Riemannian proximal gradient methods. Math. Program. 194, 371–413 (2022)
[19] Gao, B., Liu, X., Chen, X., Yuan, Y.X.: A new first-order algorithmic framework for optimization problems with orthogonality constraints. SIAM J. Optim. 28(1), 302–332 (2018)
[20] Hu, J., Liu, X., Wen, Z.W., Yuan, Y.X.: A brief introduction to manifold optimization. J. Oper. Res. Soc. China 8(2), 199–248 (2020)
[21] Absil, P.A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2009)
[22] Tu, L.W.: An Introduction to Manifolds, 2nd edn. Springer, Universitext, New York (2011)
[23] Boumal, N.: An Introduction to Optimization on Smooth Manifolds. Cambridge University Press, Cambridge (2022)
[24] Uschmajew, A., Vandereycken, B.: The geometry of algorithms using hierarchical tensors. Linear Algebra Appl. 439(1), 133–166 (2013)
[25] Holtz, S., Rohwedder, T., Schneider, R.: On manifolds of tensors of fixed TT-rank. Numer. Math. 120(4), 701–731 (2012)
[26] Kressner, D., Steinlechner, M., Vandereycken, B.: Low-rank tensor completion by Riemannian optimization. BIT Numer. Math. 54(2), 447–468 (2014)
[27] Heidel, G., Schulz, V.: A Riemannian trust-region method for low-rank tensor completion. Numer. Linear Algebra Appl. 25(6), e2175 (2018)
[28] Steinlechner, M.: Riemannian optimization for high-dimensional tensor completion. SIAM J. Sci. Comput. 38(5), S461–S484 (2016)
[29] Breiding, P., Vannieuwenhoven, N.: A Riemannian trust region method for the canonical tensor rank approximation problem. SIAM J. Optim. 28(3), 2435–2465 (2018)
[30] Gilman, K., Tarzanagh, D.A., Balzano, L.: Grassmannian optimization for online tensor completion and tracking with the t-SVD. IEEE Trans. Signal Process. 70, 2152–2167 (2022)
[31] Song, G.J., Wang, X.Z., Ng, M.K.: Riemannian conjugate gradient descent method for fixed multi rank third-order tensor completion. J. Comput. Appl. Math. 421, 114866 (2023)
[32] Zhang, X., Yang, Z.P., Cao, C.G.: Inequalities involving Khatri–Rao products of positive semidefinite matrices. Appl. Math. E-Notes 2, 117–124 (2002)
[33] Huang, W.: Optimization algorithms on Riemannian manifolds with applications. Ph.D. thesis, The Florida State University (2013)
[34] Zhu, X.: A Riemannian conjugate gradient method for optimization on the Stiefel manifold. Comput. Optim. Appl. 67(1), 73–110 (2017)
[35] Edelman, A., Arias, T., Smith, S.: The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20(2), 303–353 (1998)
[36] Bunse-Gerstner, A., Byers, R., Mehrmann, V.: Numerical methods for simultaneous diagonalization. SIAM J. Matrix Anal. Appl. 14(4), 927–949 (1993)
[37] Pesquet-Popescu, B., Pesquet, J.C., Petropulu, A.P.: Joint singular value decomposition-a new tool for separable representation of images. In: International Conference on Image Processing. vol. 2, pp. 569–572. IEEE, Thessaloniki, Greece (2001)
[38] Shashua, A., Levin, A.: Linear image coding for regression and classification using the tensor-rank principle. In: International Conference on Artificial Intelligence and Statistics. vol. 1, pp. I–42–I–49. IEEE Computer Society, Kauai, HI, USA (2001)
[39] Allen, G.I.: Sparse higher-order principal components analysis. In: International Conference on Artificial Intelligence and Statistics. vol. 22, pp. 27–36. PMLR, La Palma, Canary Islands (2012)
[40] Wang, Y., Dong, M., Xu, Y.: A sparse rank-1 approximation algorithm for high-order tensors. Appl. Math. Lett. 102, 106140 (2020)
[41] Mao, X., Yang, Y.: Several approximation algorithms for sparse best rank-1 approximation to higher-order tensors. J. Glob. Optim. (2022). https://doi.org/10.1007/s10898-022-01140-4
[42] Kwak, N.: Principal component analysis based on $\ell _1$-norm maximization. IEEE Trans. Pattern Anal. Mach. Intell. 30(9), 1672–1680 (2008)
[43] Hao, N., Kilmer, M.E., Braman, K., Hoover, R.C.: Facial recognition using tensor–tensor decompositions. SIAM J. Imaging Sci. 6(1), 437–463 (2013)
[44] Schönemann, P.H.: A generalized solution of the orthogonal Procrustes problem. Psychometrika 31(1), 1–10 (1966)
[45] Lin, J., Huang, T.Z., Zhao, X.L., Jiang, T.X., Zhuang, L.: A tensor subspace representation-based method for hyperspectral image denoising. IEEE Trans. Geosci. Remote Sens. 59(9), 7739–7757 (2020)
[46] Xu, S.S., Huang, T.Z., Lin, J., Chen, Y.: T-hy-demosaicing: hyperspectral reconstruction via tensor subspace representation under orthogonal transformation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 14, 4842–4853 (2021)
[47] Xu, T., Huang, T.Z., Deng, L.J., Yokoya, N.: An iterative regularization method based on tensor subspace representation for hyperspectral image super-resolution. IEEE Trans. Geosci. Remote Sens. 60, 1–16 (2022)
[48] Hoover, R.C., Caudle, K., Braman, K.: Multilinear discriminant analysis through tensor-tensor eigendecomposition. In: ICMLA. pp. 578–584. IEEE, Orlando, FL (2018)
[49] Ozdemir, C., Hoover, R.C., Caudle, K., Braman, K.: High-order multilinear discriminant analysis via order-$n$ tensor eigendecomposition. Technical report, SSRN (2022). https://dx.doi.org/10.2139/ssrn.4203431
[50] Vervliet, N., Debals, O., Sorber, L., Van Barel, M., De Lathauwer, L.: Tensorlab 3.0 (2016). http://www.tensorlab.net
[51] Lu, C.: Tensor-Tensor Product Toolbox. Carnegie Mellon University, Pittsburgh (2018)
[52] Iannazzo, B., Porcelli, M.: The Riemannian Barzilai–Borwein method with nonmonotone line search and the matrix geometric mean computation. IMA J. Numer. Anal. 38(1), 495–517 (2018)
[53] Kilmer, M.E., Horesh, L., Avron, H., Newman, E.: Tensor–tensor algebra for optimal representation and compression of multiway data. Proc. Natl. Acad. Sci. U.S.A. 118(28), e2015851118 (2021)
[54] Kernfeld, E., Kilmer, M., Aeron, S.: Tensor–tensor products with invertible linear transforms. Linear Algebra Appl. 485, 545–570 (2015)
[55] Hall, B.C.: Lie Groups, Lie Algebras, and Representations. Springer, Cham (2015)
[56] Van Loan, C.: Computing integrals involving the matrix exponential. IEEE Trans. Autom. Control 23(3), 395–404 (1978)
[57] Van Loan, C.F.: The ubiquitous Kronecker product. J. Comput. Appl. Math. 123(1–2), 85–100 (2000)
[58] Kolda, T.G.: Multilinear operators for higher-order decompositions. Tech. Rep. SAND2006-2081, 923081, Citeseer (2006)

Author information

Authors and Affiliations

School of Physical Science and Technology, Guangxi University, Nanning, 530004, Guangxi, China
Xian-Peng Mao
College of Mathematics and Information Science, Guangxi University, Nanning, 530004, Guangxi, China
Ying Wang & Yu-Ning Yang
Center for Applied Mathematics of Guangxi, Guangxi University, Nanning, 530004, Guangxi, China
Yu-Ning Yang

Contributions

X.-P. Mao, Y. Wang, and Y.-N. Yang developed the theory, designed the algorithms, performed the numerical experiments, drafted the manuscript, and read and approved the final manuscript.

Corresponding author

Correspondence to Yu-Ning Yang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

This work was supported by the National Natural Science Foundation of China (No. 12171105), Fok Ying Tong Education Foundation (No. 171094), and the special foundation for Guangxi Bagui Scholars.

Appendix

A Preliminaries on Riemannian Manifold

Basic definitions and properties of Riemannian manifolds can be found in the books [21,22,23]. For convenience and to keep the paper self-contained, we summarize the necessary ones in this section.

Definition 13

[21, 22] A topological manifold $\mathscr {M}$ of dimension n is a Hausdorff, second countable, locally Euclidean space of dimension n. Let $\mathscr {N}$ be a submanifold of $\mathscr {M}$. If the manifold topology of $\mathscr {N}$ coincides with its subspace topology induced from the topological space $\mathscr {M}$, then $\mathscr {N}$ is called an embedded submanifold of the manifold $\mathscr {M}$.

Definition 14

[21] A tangent vector $\xi _{x}$ to $\mathscr {M}$ at x is defined as a mapping from $\mathfrak {F}_{x}(\mathscr {M})$, the set of smooth real-valued functions defined on a neighborhood of x, to $\mathbb {R}$ such that $ \xi _{x} f:=\dot{\gamma }(0) f:= \frac{\textrm{d}}{\textrm{d} t}f(\gamma (t))\mid _{t=0}, \quad \forall f \in \mathfrak {F}_{x}(\mathscr {M}), $ for some smooth curve $\gamma (t)$ on $\mathscr {M}$ with $\gamma (0)=x$. The tangent space $T_{x} \mathscr {M}$ to $\mathscr {M}$ at x is defined as the set of all tangent vectors to $\mathscr {M}$ at x. The set $T\mathscr {M}:=\bigcup _{x \in \mathscr {M}} T_{x} \mathscr {M}$ is called the tangent bundle of the manifold.

Definition 15

[21] The differential of $F: \mathscr {M} \rightarrow \mathscr {N}$ at x is a linear operator $\textrm{D}F(x): T_{x} \mathscr {M} \rightarrow T_{F(x)} \mathscr {N}$ defined by: $ \textrm{D} F(x)[v]:=\frac{\textrm{d}}{\textrm{d} t} F(\gamma (t))\mid _{t=0}, $ where $\gamma (t)$ is any curve on the manifold that satisfies $\gamma (0)=x$ and $\dot{\gamma }(0)=v$.

Definition 16

[21] A Riemannian metric g assigns to each tangent space $T_{x} \mathscr {M}$ an inner product $g_{x}: T_{x} \mathscr {M} \times T_{x} \mathscr {M} \rightarrow \mathbb {R}$, written $ g_{x}(\eta , \xi )=\langle \eta , \xi \rangle _{x} $ for $\eta , \xi \in T_{x} \mathscr {M}$, that varies smoothly with x. A Riemannian manifold is the pair $(\mathscr {M}, g)$.

Definition 17

[33] The geodesic $\gamma (t)$ defined by an affine connection is a curve that satisfies $ \ddot{\gamma }(t):=\frac{\textrm{D}^{2}}{\textrm{d} t^{2}} \gamma (t):=\frac{\textrm{D}}{\textrm{d} t} \dot{\gamma }(t) =0, $ where $\frac{\textrm{D}}{\textrm{d} t}$ is the induced covariant derivative (see [23, Thm. 5.29]).

Definition 18

[21] The Riemannian gradient ${\text {grad}}f(x)$ of a function f at x is the unique vector in $T_x\mathscr {M}$ satisfying $\langle {\text {grad}} f(x), \xi _x\rangle _{x}=\textrm{D} f(x)[\xi _x], \quad \forall \xi _x \in T_{x} \mathscr {M}.$ The Riemannian Hessian $ {\text {Hess}}f(x)$ is a mapping from the tangent space $T_{x} \mathscr {M}$ to itself: $ {\text {Hess}} f(x)[\xi ]:={\nabla }_{\xi } {\text {grad}} f(x), $ where ${\nabla }$ is the Riemannian connection on $\mathscr {M}$ (see [21, Thm. 5.3.1]).

Lemma 8

[23] Let $\mathscr {M}$ be a Riemannian submanifold of a Euclidean space $\mathscr {E}$ and let $f: \mathscr {M} \rightarrow \mathbb {R}$ be a smooth function. Then,

$$\begin{aligned} \begin{aligned} {\text {grad}} f(x)&=\varvec{P}_x({\text {grad}} {\bar{f}}(x)), \\ {\text { Hess }} f(x)[u]&=\varvec{P}_x(\textrm{D} {\bar{G}}(x)[u]), u \in T_{x} \mathscr {M}, \end{aligned} \end{aligned}$$

where $\textrm{D}$ is the Euclidean derivative, $\varvec{P}_x(y) $ denotes the orthogonal projection of $y \in \mathscr {E}$ onto $T_{x} \mathscr {M}$, $\bar{f}$ is any smooth extension of the scalar field f to a neighborhood of $\mathscr {M}$ in $\mathscr {E}$, and $\bar{G}$ is any smooth extension of the vector field $G:={\text {grad}} f$ to such a neighborhood.
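As an illustration of Lemma 8, the following minimal sketch evaluates the projected gradient and Hessian on the unit sphere (the matrix Stiefel manifold with $p=1$) for the quadratic $\bar{f}(x)=x^{\top }Ax$. The choice of manifold, the cost function, and all function names are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np

def proj(x, z):
    """Orthogonal projection of z onto the tangent space of the unit sphere at x."""
    return z - (x @ z) * x

def riem_grad(A, x):
    """grad f(x) = P_x(grad fbar(x)) with fbar(x) = x^T A x and A symmetric."""
    return proj(x, 2.0 * A @ x)

def riem_hess(A, x, u):
    """Hess f(x)[u] = P_x(D Gbar(x)[u]) with the extension Gbar(y) = 2 (A y - (y^T A y) y)."""
    a = x @ A @ x
    # Euclidean directional derivative of Gbar at x along u
    dG = 2.0 * (A @ u - 2.0 * (x @ A @ u) * x - a * u)
    return proj(x, dG)

rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                      # symmetric test matrix
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                 # point on the sphere
u = proj(x, rng.standard_normal(n))    # tangent vector at x

g = riem_grad(A, x)
h = riem_hess(A, x, u)
print("x^T grad =", x @ g, " x^T Hess[u] =", x @ h)
```

Both printed inner products should vanish up to rounding, since the gradient and the Hessian image lie in the tangent space at $x$.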

Retraction provides a method to map the tangent vector to the next iterate on the manifold.

Definition 19

(cf. [21, Def. 4.1.1]) A retraction on a manifold $\mathscr {M}$ is a smooth mapping R from the tangent bundle $T\mathscr {M} $ onto $\mathscr {M}$ with the following properties. Let $R_x$ denote the restriction of R to $T_x\mathscr {M}$. Then (i) $R_x(0_x) = x$, where $0_x$ denotes the zero element of $T_x\mathscr {M} $, and (ii) $\textrm{D} R_x(0_x):T_x\mathscr {M}\rightarrow T_x\mathscr {M}$ is the identity map: $\textrm{D} R_x(0_x)[v] = v$.
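The two defining properties of a retraction can be checked numerically. The sketch below does so for the QR-based retraction on the matrix Stiefel manifold, which corresponds to $l=1$; the helper names `qf`, `proj_stiefel`, and `retract_qr` are hypothetical, and the finite-difference step is an arbitrary choice.

```python
import numpy as np

def sym(M):
    return (M + M.T) / 2

def proj_stiefel(X, Z):
    """Projection of Z onto the tangent space of the matrix Stiefel manifold at X."""
    return Z - X @ sym(X.T @ Z)

def qf(M):
    """Q factor of the thin QR decomposition, normalized so that diag(R) > 0."""
    Q, R = np.linalg.qr(M)
    signs = np.sign(np.diagonal(R))
    signs[signs == 0] = 1.0
    return Q * signs                   # flip columns of Q where diag(R) was negative

def retract_qr(X, xi):
    """QR-based retraction R_X(xi) = qf(X + xi)."""
    return qf(X + xi)

rng = np.random.default_rng(1)
n, p = 7, 3
X = qf(rng.standard_normal((n, p)))                  # a point with X^T X = I
v = proj_stiefel(X, rng.standard_normal((n, p)))     # a tangent vector at X

# (i) R_X(0) = X
err_zero = np.linalg.norm(retract_qr(X, np.zeros_like(X)) - X)
# (ii) D R_X(0)[v] = v, approximated by a forward difference
t = 1e-6
fd = (retract_qr(X, t * v) - X) / t
err_diff = np.linalg.norm(fd - v)
print(err_zero, err_diff)
```

The second error is of the order of the step size rather than machine precision, because a forward difference is only first-order accurate.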

For the embedded submanifold of a vector space, a simple way to construct retractions is specified in the following.

Lemma 9

(cf. [21, Prop. 4.1.2]) Let $\mathscr {M}$ be an embedded manifold of a vector space $\mathscr {E}$ and let $\mathscr {N}$ be an abstract manifold such that $\dim {{\mathscr {M}}}+\dim {{\mathscr {N}}}=\dim {{\mathscr {E}}}$. Assume that there is a diffeomorphism $ \phi :\mathscr {M} \times \mathscr {N}\rightarrow \mathscr {E}_{*}:(F,G)\mapsto \phi (F,G), $ where $\mathscr {E}_{*}$ is an open subset of $\mathscr {E}$(thus $\mathscr {E}_{*}$ is an open submanifold of $\mathscr {E}$), with a neutral element $I \in \mathscr {N}$ satisfying $ \phi (F,I) = F,~~ \forall F \in \mathscr {M}. $ Then the mapping $ R_{X}(\xi ) = \pi _1 \left( \phi ^{-1}(X+\xi ) \right) , $ where $ \pi _1:\mathscr {M} \times \mathscr {N}\rightarrow \mathscr {M}:(F,G)\mapsto F$ is the projection onto the first component, defines a retraction on $\mathscr {M}$.

To compare tangent vectors at distinct points on the manifold, a vector transport associated with a retraction R provides a way to transport a tangent vector $\xi \in T_x\mathscr {M}$ to the tangent space $T_{R_x(\eta )}\mathscr {M} $ for some $\eta \in T_x\mathscr {M} $.

Definition 20

(cf. [21, Def. 8.1.1]) A vector transport $\mathcal {T} : T\mathscr {M} \oplus T\mathscr {M} \rightarrow T\mathscr {M}:(\eta _x,\xi _x ) \mapsto \mathcal {T}_{\eta _x}\xi _x$ associated with a retraction R is a smooth mapping satisfying the following properties for all $x\in \mathscr {M}$: (i) $\mathcal {T} _{\eta _x}\xi _x\in T_{R_x(\eta _x)}\mathscr {M}$, (ii) $\mathcal {T} _{0_{x}}\xi _x = \xi _x$ for all $\xi _x\in T_x\mathscr {M}$, and (iii) $\mathcal {T} _{\eta _{x}}(a\xi _x+b\zeta _x) = a\mathcal {T}_{\eta _{x}}\xi _x+b\mathcal {T}_{\eta _{x}}\zeta _x$ for all $a,b\in \mathbb {R}$ and $\xi _x,\zeta _x\in T_x\mathscr {M}$. Vector transport by differentiated retraction is defined as

$$\begin{aligned} \mathcal {T}_{\eta _x}\xi _x:= \textrm{D} R_{x}(\eta _x)[\xi _x] = \frac{\textrm{d}}{\textrm{d}t}R_{x}(\eta _x+t\xi _x)\mid _{t=0}. \end{aligned} \tag{A1}$$

Lemma 10

(cf. [21, Sect. 8.1.3]) A vector transport on $\mathscr {M}$ associated with a retraction R is given by the orthogonal projection onto the tangent space, i.e., $ \mathcal {T}_{\eta _x}{\xi _x} = \varvec{P}_{R_x(\eta _x)}\xi _x. $
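A minimal sketch of the projection-based transport of Lemma 10, again on the matrix Stiefel manifold ($l=1$) with a QR-based retraction. The manifold, the retraction, and the helper names are assumptions chosen for illustration; the check only confirms that the transported vector is tangent at the new point and that transporting along the zero vector changes nothing.

```python
import numpy as np

def sym(M):
    return (M + M.T) / 2

def proj_stiefel(X, Z):
    """Orthogonal projection of Z onto the tangent space at X (X^T X = I)."""
    return Z - X @ sym(X.T @ Z)

def qf(M):
    """Q factor of the thin QR decomposition with diag(R) > 0."""
    Q, R = np.linalg.qr(M)
    signs = np.sign(np.diagonal(R))
    signs[signs == 0] = 1.0
    return Q * signs

def transport_proj(X, eta, xi):
    """T_eta(xi) = P_{R_X(eta)}(xi); also returns the new point R_X(eta)."""
    Y = qf(X + eta)
    return proj_stiefel(Y, xi), Y

rng = np.random.default_rng(2)
n, p = 7, 3
X = qf(rng.standard_normal((n, p)))
eta = proj_stiefel(X, rng.standard_normal((n, p)))
xi = proj_stiefel(X, rng.standard_normal((n, p)))

T, Y = transport_proj(X, eta, xi)
tangency = np.linalg.norm(sym(Y.T @ T))   # should vanish: Y^T T is skew-symmetric
T0, _ = transport_proj(X, np.zeros_like(X), xi)
print(tangency, np.linalg.norm(T0 - xi))
```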

Definition 21

[34] A vector transport $\mathcal {T}$ is called isometric if for all $\eta , \xi \in T_{x} \mathscr {M}$, it satisfies $ \left\langle \mathcal {T}_{\eta }(\xi ), \mathcal {T}_{\eta }(\xi )\right\rangle _{R_{x}(\eta )}=\langle \xi , \xi \rangle _{x} $ where R is the retraction associated with $\mathcal {T}$.

B Proofs of Theorems, Propositions and Lemmas in Sect. 2

B.1 Proof of Proposition 2

Proof

Taking the conjugate transpose of both sides of the equation in item (i) of Proposition 1 and then left-multiplying both sides by $(F_l\otimes I_p)$, we get

$$\begin{aligned} (F_l\otimes I_p){\text {bcirc}}(\mathcal {A}^{\top }) ={\text {Diag}}\left( (\hat{A}^{(i)})^H: i \in [l]\right) (F_l\otimes I_n), \end{aligned}$$

where we use the following property ([9, Lem. 3]): ${\text {bcirc}}(\mathcal {A})^{\top } = {\text {bcirc}}(\mathcal {A}^{\top }) $. Taking the first block column on both sides of the above equation yields

$$\begin{aligned} (F_l\otimes I_p){\text {unfold}}(\mathcal {A}^{\top })= & {} \frac{1}{\sqrt{l}}{\text {Diag}}\left( (\hat{A}^{(i)})^H: i \in [l]\right) {\text {Vec}}\left( I_n: i \in [l]\right) \\= & {} \frac{1}{\sqrt{l}} {\text {Vec}}\left( (\hat{A}^{(i)})^H: i \in [l]\right) , \end{aligned}$$

which, combined with Definition 6, gives $L(\mathcal {A} ^{\top }) = {\text {fold}}\left( (\hat{A}^{(i)})^{H}: i \in [l]\right) .$
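The identity just proved can be checked numerically. The sketch below assumes that $\hat{A}^{(i)}$ is obtained from the unnormalized discrete Fourier transform along the third mode (here `np.fft.fft`) and that the tensor transpose transposes each frontal slice and reverses the order of slices $2,\ldots ,l$, as in the Kilmer and Martin definition; the function name `t_transpose` is hypothetical.

```python
import numpy as np

def t_transpose(A):
    """Tensor transpose: transpose every frontal slice, then reverse slices 2..l."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3, 5))           # real tensor of size n x p x l

A_hat = np.fft.fft(A, axis=2)                # Fourier-domain slices of A
At_hat = np.fft.fft(t_transpose(A), axis=2)  # Fourier-domain slices of A^T

# Slice i of L(A^T) should be the conjugate transpose of slice i of L(A)
expected = np.conj(np.transpose(A_hat, (1, 0, 2)))
err = np.max(np.abs(At_hat - expected))
print(err)
```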

B.2 Proof of Theorem 3

Proof

t-QR was proposed in [7, Sect. 2.5].

For $i= 1, \cdots , \lceil \frac{l+1}{2}\rceil ,$ let $ \hat{A}^{(i)} = \hat{Q}^{(i)}\cdot \hat{R}^{(i)}$ be the QR decomposition of $\hat{A}^{(i)}\in \mathbb C^{n\times p} $ (see the Notes section on the QR factorization of complex matrices), where $\hat{Q}^{(i)}\in \mathbb C^{n\times p} , (\hat{Q}^{(i)})^H\cdot \hat{Q}^{(i)} = I_p$, $\hat{R}^{(i)}\in \mathbb C^{p\times p} _{\textrm{upp}}$ and ${\text {diag}}(\hat{R}^{(i)})\in \mathbb R^{p\times p} $, namely, the diagonal entries of ${\hat{R}}^{(i)}$ are real. For $i=1+ \lceil \frac{l+1}{2} \rceil ,\cdots ,l$, we have $ {\hat{A}}^{(i)} = \textrm{conj}\left( {\hat{A}}^{(l+2-i)}\right) $ and set $ {\hat{Q}}^{(i)} = \textrm{conj}\left( {\hat{Q}}^{(l+2-i)}\right) , {\hat{R}}^{(i)} = \textrm{conj}\left( {\hat{R}}^{(l+2-i)}\right) . $ It follows from Remarks 3 and 4 that ${\mathcal {Q}} \in \textrm{St}\left( n,p,l\right) $ and ${\mathcal {R}}\in \mathbb {R}_{\textrm{upp}}^{p\times p \times l} $. Here ${\mathcal {R}}$ is real by Remark 3 and direct computation. Using Remark 3 again, we further have $\hat{A}^{(1)}\in \mathbb R^{n\times p} , \hat{Q}^{(1)}\in \mathbb R^{n\times p} , \hat{R}^{(1)}\in \mathbb R^{p\times p} $.

We then show the uniqueness of the decomposition. Recall that, for the QR decomposition of a matrix $\hat{A}^{(i)}\in \mathbb C^{n\times p} $ with $n\geqslant p$, if $\hat{A}^{(i)}, i \in [l]$ are of full rank p, namely, $\hat{\mathcal {A} }\in \mathbb C^{n\times p \times l} _*$, then the QR decomposition $\hat{A}^{(i)} = \hat{Q}^{(i)}\hat{R}^{(i)}$ is unique if we require that the diagonal entries of $\hat{R}^{(i)}$ are all positive, namely, $\hat{\mathcal {R} }\in \mathbb {C}_{\textrm{upp}+}^{p\times p \times l} $. Since the Fourier transform is bijective, the uniqueness of the matrix QR decomposition leads to the uniqueness of the t-QR decomposition.
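The construction in this proof can be sketched in code: factor the first $\lceil \frac{l+1}{2}\rceil $ Fourier-domain slices, normalize each R factor to have a positive diagonal, fill the remaining slices by conjugation, and transform back. This is an illustrative sketch assuming the unnormalized FFT along the third mode and full-rank slices; the function names are hypothetical and the code is not the authors' implementation.

```python
import numpy as np

def t_product(A, B):
    """t-product computed slice-wise in the Fourier domain."""
    Ch = np.einsum('ijk,jmk->imk', np.fft.fft(A, axis=2), np.fft.fft(B, axis=2))
    return np.real(np.fft.ifft(Ch, axis=2))

def t_transpose(A):
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def t_identity(p, l):
    I = np.zeros((p, p, l))
    I[:, :, 0] = np.eye(p)
    return I

def t_qr(A):
    """t-QR of a real n x p x l tensor whose Fourier slices have full rank p."""
    n, p, l = A.shape
    Ah = np.fft.fft(A, axis=2)
    Qh = np.empty((n, p, l), dtype=complex)
    Rh = np.empty((p, p, l), dtype=complex)
    half = l // 2 + 1                      # equals ceil((l + 1) / 2)
    for i in range(half):
        Q, R = np.linalg.qr(Ah[:, :, i])
        d = np.diagonal(R)
        phase = d / np.abs(d)              # unit-modulus factors of diag(R)
        Qh[:, :, i] = Q * phase            # rescale columns of Q
        Rh[:, :, i] = np.conj(phase)[:, None] * R   # rescale rows: diag(R) > 0
    for i in range(half, l):               # remaining slices by conjugate symmetry
        Qh[:, :, i] = np.conj(Qh[:, :, l - i])
        Rh[:, :, i] = np.conj(Rh[:, :, l - i])
    return np.real(np.fft.ifft(Qh, axis=2)), np.real(np.fft.ifft(Rh, axis=2))

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3, 4))
Q, R = t_qr(A)
print(np.max(np.abs(t_product(Q, R) - A)))
print(np.max(np.abs(t_product(t_transpose(Q), Q) - t_identity(3, 4))))
```

Both printed residuals should be at rounding level, indicating $\mathcal {Q} * \mathcal {R} = \mathcal {A}$ and $\mathcal {Q}^{\top } * \mathcal {Q} = \mathcal {I}$ for this random instance.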

B.3 Proof of Lemma 3

Proof

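The proof of Theorem 3 shows that S is isomorphic to the set $\hat{S}$ of Fourier-domain tensors described below, whose form depends on the parity of l.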

If l is even, then it holds that

$$\begin{aligned} \begin{aligned} \hat{S} = \Biggl \{\hat{\mathcal {R} }~\bigg |~&\hat{R}^{(1)},\hat{R}^{(\frac{l}{2}+1)}\in \mathbb R^{p\times p} _{\textrm{upp}+}, \hat{R}^{(i)}\in \mathbb C^{p\times p} _{\textrm{upp}+}, i \in [l]\setminus \{1,\frac{l}{2}+1\},\\&\hat{R}^{(i)} = {\text {conj}}(\hat{R}^{(l+2-i)}), i = 2,\cdots ,\frac{l}{2} \Biggr \}. \end{aligned} \end{aligned}$$

Then we examine the matrices $\hat{R}^{(i)}, i \in [l]$ that contain free variables. There are two real upper triangular $p\times p$ matrices, each of dimension $\frac{(1+p)p}{2}$; there are $\frac{l-2}{2}$ complex upper triangular $p\times p$ matrices with positive diagonal elements, each of dimension $\frac{(p-1)p}{2}\times 2 + p$. Hence the dimension of $\hat{S}$ is $2\times \frac{(1+p)p}{2}+\frac{l-2}{2}\times \big (\frac{(p-1)p}{2}\times 2 + p\big )=\frac{p^2l}{2}+p$.

If l is odd, then it holds that

$$\begin{aligned} \begin{aligned} \hat{S} = \Biggl \{\hat{\mathcal {R} }~\bigg |~&\hat{R}^{(1)}\in \mathbb R^{p\times p} _{\textrm{upp}+}, \hat{R}^{(i)}\in \mathbb C^{p\times p} _{\textrm{upp}+}, i \in [l]\setminus \{1\},\\&\hat{R}^{(i)} = {\text {conj}}(\hat{R}^{(l+2-i)}), i = 2,\cdots ,\frac{l+1}{2} \Biggr \}. \end{aligned} \end{aligned}$$

There is one real upper triangular $p\times p$ matrix of dimension $\frac{(1+p)p}{2}$; there are $\frac{l-1}{2}$ complex upper triangular $p\times p$ matrices with positive diagonal elements, each of dimension $\frac{(p-1)p}{2}\times 2 + p$. Hence the dimension of $\hat{S}$ is $ \frac{(1+p)p}{2}+\frac{l-1}{2}\times \big (\frac{(p-1)p}{2}\times 2 + p\big )=\frac{p^2l+p}{2}$.
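The two dimension counts can be cross-checked against their closed forms with a few lines of arithmetic. The function name `dim_S_hat` is hypothetical; the sketch simply replays the counting argument above.

```python
def dim_S_hat(p, l):
    """Count the free real parameters of S-hat as in the proof of Lemma 3."""
    real_tri = p * (p + 1) // 2       # real upper triangular p x p slice
    cplx_tri = p * (p - 1) + p        # complex upper triangular, real diagonal
    if l % 2 == 0:
        # two real slices and (l - 2) / 2 independent complex slices
        return 2 * real_tri + (l - 2) // 2 * cplx_tri
    # one real slice and (l - 1) / 2 independent complex slices
    return real_tri + (l - 1) // 2 * cplx_tri

def closed_form(p, l):
    """Closed forms stated in the proof: p^2 l / 2 + p (even l), (p^2 l + p) / 2 (odd l)."""
    if l % 2 == 0:
        return p * p * l // 2 + p
    return (p * p * l + p) // 2

for p in (1, 2, 3):
    for l in (1, 2, 3, 4, 5):
        print(p, l, dim_S_hat(p, l), closed_form(p, l))
```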

B.4 Proof of Theorem 4

Proof

Let ${\mathcal {A}}= {\mathcal {U}} * \mathcal S * {\mathcal {V}}^{\top }$ be the compact t-SVD of ${\mathcal {A}}$. Let ${\mathcal {P}}:=\mathcal U * {\mathcal {V}}^{\top }$ and ${\mathcal {H}}:= \mathcal V * {\mathcal {S}} * {\mathcal {V}}^{\top }$. Then it is clear that (8) is satisfied. To see that ${\mathcal {H}}\in \textrm{Sym}(\mathbb R_+^{p\times p \times l}) $, we first show that ${\mathcal {S}}\in \textrm{Sym}(\mathbb R_+^{p\times p \times l}) $. This holds because each ${\hat{S}}^{(i)}$ is diagonal with nonnegative entries, and so ${\mathcal {S}}\in \textrm{Sym}(\mathbb R_+^{p\times p \times l}) $ according to Remark 5. By [13, Thm. 7], there is a unique ${\mathcal {T}}$ such that ${\mathcal {T}} * {\mathcal {T}}^{\top }= \mathcal S$. Then ${\mathcal {H}} $ can be written as ${\mathcal {H}} = \mathcal V * {\mathcal {T}} * \left( {\mathcal {V}} * \mathcal T \right) ^{\top }$, which together with [13, Thm. 8] shows that ${\mathcal {H}} \in \textrm{Sym}(\mathbb R_+^{p\times p \times l}) $.

To show the uniqueness of ${\mathcal {H}}$, note that $ \mathcal A^{\top } * {\mathcal {A}}={\mathcal {H}} * {\mathcal {H}}$, which by [13, Thm. 8] is symmetric positive semidefinite. Invoking [13, Thm. 7] again gives the uniqueness of ${\mathcal {H}}$.

If ${\mathcal {A}}^{\top } * {\mathcal {A}}\in \textrm{Sym}(\mathbb R_{++}^{p\times p \times l}) $, [13, Thm. 8] shows that ${\mathcal {H}}$ is nonsingular (invertible, Def. 5), and so ${\mathcal {P}} = \mathcal A * {\mathcal {H}}^{-1}$, which is unique.

Remark 16

The proof of Theorem 4 gives the way to obtain t-PD from the compact t-SVD. This is analogous to the matrix case.
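Following Remark 16, a t-polar decomposition can be sketched by taking a thin SVD of every Fourier-domain slice and forming $\hat{P}^{(i)} = \hat{U}^{(i)}(\hat{V}^{(i)})^H$ and $\hat{H}^{(i)} = \hat{V}^{(i)}\hat{S}^{(i)}(\hat{V}^{(i)})^H$. The sketch assumes the unnormalized FFT along the third mode and full-rank slices, so that the polar factors of conjugate slices are themselves conjugate; all function names are hypothetical.

```python
import numpy as np

def t_product(A, B):
    Ch = np.einsum('ijk,jmk->imk', np.fft.fft(A, axis=2), np.fft.fft(B, axis=2))
    return np.real(np.fft.ifft(Ch, axis=2))

def t_transpose(A):
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def t_identity(p, l):
    I = np.zeros((p, p, l))
    I[:, :, 0] = np.eye(p)
    return I

def t_polar(A):
    """t-polar decomposition A = P * H obtained from slice-wise thin SVDs."""
    n, p, l = A.shape
    Ah = np.fft.fft(A, axis=2)
    Ph = np.empty((n, p, l), dtype=complex)
    Hh = np.empty((p, p, l), dtype=complex)
    for i in range(l):
        U, s, Vh = np.linalg.svd(Ah[:, :, i], full_matrices=False)
        Ph[:, :, i] = U @ Vh                      # U V^H
        Hh[:, :, i] = (Vh.conj().T * s) @ Vh      # V diag(s) V^H
    return np.real(np.fft.ifft(Ph, axis=2)), np.real(np.fft.ifft(Hh, axis=2))

rng = np.random.default_rng(5)
A = rng.standard_normal((5, 3, 4))
P, H = t_polar(A)
print(np.max(np.abs(t_product(P, H) - A)))
print(np.max(np.abs(t_product(t_transpose(P), P) - t_identity(3, 4))))
print(np.max(np.abs(t_transpose(H) - H)))
```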

B.5 Proof of Proposition 6

Proof

This can be easily derived from the proof of Theorem 4. Here the root of a symmetric positive definite tensor was defined in [13, Thm. 7].

B.6 Proof of Proposition 7

Proof

If $\hat{\mathcal {A}}\in \mathbb {C}_*^{n\times p \times l}$, then $(\hat{A}^{(i)})^H\hat{A}^{(i)}, i \in [l]$ are Hermitian positive definite. Note that [13, Thm. 5] shows that $(\hat{A}^{(i)})^H\hat{A}^{(i)}, i \in [l]$ are Hermitian positive definite if and only if $ \mathcal {A}^{\top }*\mathcal {A}\in \textrm{Sym}(\mathbb R_{++}^{p\times p \times l}) $.

B.7 Proof of Theorem 5

Proof

Let ${\mathcal {D}}:={\mathcal {U}}^{\top }\in \mathbb R^{p\times n \times l} $. Then for any ${\mathcal {P}}\in \textrm{St}\left( n,p,l\right) $,

$$\begin{aligned} l \left\langle {\mathcal {A}} , {\mathcal {P}}\right\rangle&= \textrm{tr}\left( {\mathcal {A}}^{\top } * {\mathcal {P}} \right) \\&= \textrm{tr}\left( {\mathcal {V}} * {\mathcal {S}} * {\mathcal {U}}^{\top } * {\mathcal {P}} \right) \\&= \textrm{tr}\left( {\mathcal {S}} * {\mathcal {D}} * {\mathcal {P}} * {\mathcal {V}} \right) \\&= \textrm{tr}\left( {\hat{S}} {\hat{D}}{\hat{P}}{\hat{V}} \right) \\&= \sum ^l_{i=1} \textrm{tr}\left( {\hat{S}}^{(i)} {\hat{D}}^{(i)} {\hat{P}}^{(i)} {\hat{V}}^{(i)} \right) = \sum ^l_{i=1} \textrm{tr}\left( {\hat{S}}^{(i)} {\hat{W}}^{(i)} \right) , \end{aligned}$$

where we let ${\hat{W}}^{(i)}:= {\hat{D}}^{(i)}{\hat{P}}^{(i)}{\hat{V}}^{(i)} \in \mathbb C^{p\times p}$. Note that ${\hat{D}}^{(i)} ({\hat{D}}^{(i)})^H = I_p$, $({\hat{P}}^{(i)})^H {\hat{P}}^{(i)} = I_p$, $({\hat{V}}^{(i)})^H{\hat{V}}^{(i)} = I_p$. Thus $ \mid ({\hat{W}}^{(i)})_{jj} \mid \leqslant 1 $, $i \in [l]$, $j \in [p]$. Therefore,

$$\begin{aligned} \sum ^l_{i=1} \textrm{tr}\left( {\hat{S}}^{(i)} {\hat{W}}^{(i)} \right)&= \sum ^l_{i=1}\sum ^p_{j=1} ({\hat{S}}^{(i)})_{jj} ({\hat{W}}^{(i)})_{jj} \\&\leqslant \sum ^l_{i=1}\sum ^p_{j=1}({\hat{S}}^{(i)})_{jj} \mid {\hat{W}}^{(i)}_{jj}\mid \\&\leqslant \sum ^l_{i=1} \textrm{tr}\left( {\hat{S}}^{(i)} \right) = \textrm{tr}\left( {\hat{S}} \right) , \end{aligned}$$

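where the inequalities use that the diagonal entries of ${\hat{S}}^{(i)}$ are nonnegative and that $\mid ({\hat{W}}^{(i)})_{jj} \mid \leqslant 1$. On the other hand, take $\mathcal P:={\mathcal {U}} * {\mathcal {V}}^{\top }$. It is easy to see that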

$$\begin{aligned} l \left\langle {\mathcal {A}} , {\mathcal {P}}\right\rangle = \textrm{tr}\left( {\hat{S}} \right) , \end{aligned}$$

namely, the upper bound is tight, which is achieved when ${\mathcal {P}} = {\mathcal {U}} * {\mathcal {V}}^{\top }$. This gives the desired result.
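The bound and its tightness can be probed numerically. The sketch below, under the same assumptions as before (unnormalized FFT along the third mode, full-rank Fourier slices, hypothetical helper names), compares $l\langle \mathcal {A},\mathcal {P}\rangle $ at $\mathcal {P} = \mathcal {U} * \mathcal {V}^{\top }$ with the sum of all slice-wise singular values and with the values at randomly generated feasible points. Random sampling can only fail to contradict the bound; it is not a proof.

```python
import numpy as np

def polar_factor(A):
    """Orthogonal factor P = U * V^T of a real tensor, computed slice-wise."""
    Ah = np.fft.fft(A, axis=2)
    Ph = np.empty(A.shape, dtype=complex)
    for i in range(A.shape[2]):
        U, _, Vh = np.linalg.svd(Ah[:, :, i], full_matrices=False)
        Ph[:, :, i] = U @ Vh
    return np.real(np.fft.ifft(Ph, axis=2))

def slice_singular_value_sum(A):
    """tr(S-hat): sum of the singular values of all Fourier-domain slices."""
    Ah = np.fft.fft(A, axis=2)
    return sum(np.linalg.svd(Ah[:, :, i], compute_uv=False).sum()
               for i in range(A.shape[2]))

rng = np.random.default_rng(6)
n, p, l = 5, 3, 4
A = rng.standard_normal((n, p, l))

P_star = polar_factor(A)
best = np.sum(A * P_star)                 # Euclidean inner product <A, P*>
sv_sum = slice_singular_value_sum(A)

# Random feasible points: orthogonal factors of random tensors
vals = [np.sum(A * polar_factor(rng.standard_normal((n, p, l))))
        for _ in range(200)]
print(l * best, sv_sum, max(vals))
```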

B.8 Proof of the Well-Defined Property of (10)

Proof

For convenience, we write $ \Delta $ for the frontal-slice-wise product (cf. [54, Def. 2.1]) between two tensors in the Fourier domain, i.e., $L(\mathcal {A} )\Delta L(\mathcal {B} ) := {\text {fold}}\left( \hat{A}^{(i)}\hat{B}^{(i)}: i \in [l]\right) $; in other words,

$$\begin{aligned} L(\mathcal {C} ) = L(\mathcal {A} )\Delta L(\mathcal {B} ) \Leftrightarrow {\hat{C}}^{(i)} = {\hat{A}}^{(i)}{\hat{B}}^{(i)}, i \in [l] \Leftrightarrow \mathcal {C} = {\mathcal {A}} * {\mathcal {B}}. \end{aligned} \tag{A2}$$

Using this notation, we have

$$\begin{aligned} \mathcal {A}^k = L^{-1} \left( L(\mathcal {A})\Delta \cdots \Delta L(\mathcal {A}) \right) = L^{-1} \left( {\text {fold}}\left( (\hat{{A}}^{(i)})^k: i \in [l]\right) \right) . \end{aligned}$$

Thus for any $N$,

$$\begin{aligned} \sum \limits _{k=0}^{N} \frac{1}{k!} \mathcal {A}^k&=\sum \limits _{k=0}^{N} \frac{1}{k!} L^{-1} \left( {\text {fold}}\left( (\hat{{A}}^{(i)})^k: i \in [l]\right) \right) \\&=L^{-1} \left( {\text {fold}}\left( \sum _{k=0}^N \frac{1}{k!} (\hat{{A}}^{(i)})^k: i \in [l]\right) \right) . \end{aligned}$$

Letting $N \rightarrow \infty $, we obtain

$$\begin{aligned} {\text {exp}} \left[ \mathcal {A} \right]= & {} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ \hat{{A}}^{(i)} \right] : i \in [l]\right) \right) \\= & {} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ (L(\mathcal {A} ))^{(i)} \right] : i \in [l]\right) \right) ,\end{aligned}$$

since the series defining the matrix exponential is convergent [55, Prop. 2.1].
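The limit above can be illustrated by comparing a truncated power series, built with the t-product, against slice-wise matrix exponentials in the Fourier domain. In this sketch `expm` is a deliberately simple scaling-and-squaring routine written for self-containment (a library routine such as `scipy.linalg.expm` would normally be preferred), the FFT is the unnormalized one along the third mode, and all names and the truncation level are assumptions.

```python
import numpy as np

def expm(M, terms=24, squarings=6):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    M = np.asarray(M, dtype=complex) / (2 ** squarings)
    E = np.eye(M.shape[0], dtype=complex)
    T = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms + 1):
        T = T @ M / k
        E = E + T
    for _ in range(squarings):
        E = E @ E
    return E

def t_product(A, B):
    Ch = np.einsum('ijk,jmk->imk', np.fft.fft(A, axis=2), np.fft.fft(B, axis=2))
    return np.real(np.fft.ifft(Ch, axis=2))

def t_exp_fourier(A):
    """t-exponential through slice-wise matrix exponentials in the Fourier domain."""
    Ah = np.fft.fft(A, axis=2)
    Eh = np.stack([expm(Ah[:, :, i]) for i in range(A.shape[2])], axis=2)
    return np.real(np.fft.ifft(Eh, axis=2))

def t_exp_series(A, N=40):
    """Truncated series sum_{k=0}^{N} A^k / k! with powers taken in the t-product."""
    n, _, l = A.shape
    term = np.zeros((n, n, l))
    term[:, :, 0] = np.eye(n)          # identity tensor, A^0
    total = term.copy()
    for k in range(1, N + 1):
        term = t_product(term, A) / k
        total = total + term
    return total

rng = np.random.default_rng(7)
A = 0.3 * rng.standard_normal((3, 3, 4))
gap = np.max(np.abs(t_exp_series(A) - t_exp_fourier(A)))
print(gap)
```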

B.9 Proof of Equivalence of (9) and (11)

Proof

Using (9) and item (i) of Proposition 1, we have

$$\begin{aligned} {\text {exp}} \left[ \mathcal {A} \right]= & {} \textrm{fold} \left( {\text {exp}} \left[ \textrm{bcirc}(\mathcal {A}) \right] \textrm{unfold}(\mathcal {I}) \right) \\= & {} \textrm{fold} \left( {\text {exp}} \left[ \big (F^H_l \otimes I_n\big ) {\text {Diag}}\left( \hat{{A}}^{(i)}: i \in [l]\right) (F_{l} \otimes I_n) \right] \textrm{unfold}(\mathcal {I}) \right) \\= & {} \textrm{fold} \left( \big (F^H_l \otimes I_n\big ) {\text {exp}} \left[ {\text {Diag}}\left( \hat{{A}}^{(i)}: i \in [l]\right) \right] (F_{l} \otimes I_n)\textrm{unfold}(\mathcal {I}) \right) \\= & {} \textrm{fold} \left( \big (F^H_l \otimes I_n\big ) {\text {exp}} \left[ {\text {Diag}}\left( \hat{{A}}^{(i)}: i \in [l]\right) \right] \frac{1}{\sqrt{l}} {\text {Vec}}\left( I_n: i \in [l]\right) \right) \\= & {} \textrm{fold} \left( \frac{1}{\sqrt{l}} \big (F^H_l \otimes I_n\big ) {\text {Diag}}\left( {\text {exp}} \left[ \hat{{A}}^{(i)} \right] : i \in [l]\right) {\text {Vec}}\left( I_n: i \in [l]\right) \right) \\= & {} \textrm{fold} \left( \frac{1}{\sqrt{l}} \big (F^H_l \otimes I_n\big ) {\text {Vec}}\left( {\text {exp}} \left[ \hat{{A}}^{(i)} \right] : i \in [l]\right) \right) \\= & {} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ \hat{{A}}^{(i)} \right] : i \in [l]\right) \right) , \end{aligned}$$

where the third equality is due to the following property of the matrix exponential [55, Prop. 2.3, 6]: if $X^{\top }X=I$, then $ {\text {exp}} \left[ XAX^{\top } \right] = X {\text {exp}} \left[ A \right] X^{\top }$ (more generally, $ {\text {exp}} \left[ XAX^{-1} \right] = X {\text {exp}} \left[ A \right] X^{-1}$ for any invertible $X$, which is applied here with the unitary matrix $X = F^H_l \otimes I_n$); the fifth equality comes from the formula $ {\text {exp}} \left[ {\text {Diag}}\left( D_{i}: i \in [l]\right) \right] = {\text {Diag}}\left( {\text {exp}} \left[ D_i \right] : i \in [l]\right) $, which follows immediately from the definition; and the last equality follows from (5).

B.10 Proof of Proposition 8

Proof

Since the t-exponential mapping

$$\begin{aligned} {\text {exp}} \left[ \mathcal {A} \right] = L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ L(\mathcal {A})^{(i)} \right] : i \in [l]\right) \right) \end{aligned}$$

is the composite of the matrix exponential mapping and linear mappings, and the matrix exponential is smooth [55, Prop. 2.16], we conclude that the t-exponential mapping is smooth.

B.11 Proof of Proposition 9

Proof

Using the corresponding property of the matrix exponential [55, Prop. 2.4], we obtain

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t} {\text {exp}} \left[ t\mathcal {A} \right]= & {} \frac{\textrm{d}}{\textrm{d}t} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ t\hat{{A}}^{(i)} \right] : i \in [l]\right) \right) \\= & {} L^{-1} \left( {\text {fold}}\left( \frac{\textrm{d}}{\textrm{d}t} {\text {exp}} \left[ t\hat{{A}}^{(i)} \right] : i \in [l]\right) \right) \\= & {} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ t\hat{{A}}^{(i)} \right] \hat{{A}}^{(i)}: i \in [l]\right) \right) \\= & {} L^{-1} \left( L( {\text {exp}} \left[ t\mathcal {A} \right] )\Delta L(\mathcal {A}) \right) = {\text {exp}} \left[ t\mathcal {A} \right] *\mathcal {A}, \end{aligned}$$

where the first equality comes from (11), while (A2) gives the last two equalities. Similarly, we can show that $\frac{\textrm{d}}{\textrm{d}t} {\text {exp}} \left[ t\mathcal {A} \right] =\mathcal {A}* {\text {exp}} \left[ t\mathcal {A} \right] $.
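The derivative formula can be checked numerically with a central finite difference. The sketch below again assumes that $L$ is the discrete Fourier transform along the third mode. The helpers `t_product`, `expm_taylor` and `t_exp` are hypothetical names, and the Taylor-series exponential is a stand-in suitable only for small-norm inputs.

```python
import numpy as np

def t_product(A, B):
    """t-product via slice-wise matrix products in the Fourier domain."""
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    return np.real(np.fft.ifft(np.einsum("ijk,jrk->irk", A_hat, B_hat), axis=2))

def expm_taylor(M, terms=40):
    """Truncated Taylor series of the matrix exponential (assumption: small norm)."""
    out = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def t_exp(A):
    """t-exponential via slice-wise matrix exponentials in the Fourier domain."""
    A_hat = np.fft.fft(A, axis=2)
    E_hat = np.stack([expm_taylor(A_hat[:, :, i]) for i in range(A.shape[2])], axis=2)
    return np.real(np.fft.ifft(E_hat, axis=2))

rng = np.random.default_rng(1)
A = 0.3 * rng.standard_normal((3, 3, 4))
t, h = 0.7, 1e-5

# Central difference approximation of d/dt exp[tA]
finite_diff = (t_exp((t + h) * A) - t_exp((t - h) * A)) / (2 * h)
right_form = t_product(t_exp(t * A), A)   # exp[tA] * A
left_form = t_product(A, t_exp(t * A))    # A * exp[tA]

err_right = np.max(np.abs(finite_diff - right_form))
err_left = np.max(np.abs(finite_diff - left_form))
print("error vs exp[tA]*A:", err_right)
print("error vs A*exp[tA]:", err_left)
```

Both errors should be on the order of the finite-difference truncation error, reflecting that $\mathcal {A}$ commutes with $ {\text {exp}} \left[ t\mathcal {A} \right] $ under the t-product.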

B.12 Proof of Proposition 10

Proof

Applying the corresponding property in the matrix case [55, Prop. 2.3, 6] and (A2), it follows that

$$\begin{aligned} {\text {exp}} \left[ \mathcal {X} *\mathcal {A}*\mathcal {X} ^{\top } \right]= & {} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ \left( L(\mathcal {X} *\mathcal {A}*\mathcal {X} ^{\top }) \right) ^{(i)} \right] : i \in [l]\right) \right) \\= & {} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ (L(\mathcal {X} )\Delta L(\mathcal {A}) \Delta L(\mathcal {X} ^{\top }))^{(i)} \right] : i \in [l]\right) \right) \\= & {} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ \hat{{X}}^{(i)}\hat{{A}}^{(i)}(\hat{{X}}^{(i)})^{H} \right] : i \in [l]\right) \right) \\= & {} L^{-1} \left( {\text {fold}}\left( \hat{{X}}^{(i)} {\text {exp}} \left[ \hat{{A}}^{(i)} \right] (\hat{{X}}^{(i)})^{H}: i \in [l]\right) \right) \\= & {} L^{-1} \left( L(\mathcal {X} )\Delta L( {\text {exp}} \left[ \mathcal {A} \right] )\Delta L(\mathcal {X} ^{\top }) \right) =\mathcal {X} * {\text {exp}} \left[ \mathcal {A} \right] *\mathcal {X} ^{\top }, \end{aligned}$$

where the first equality comes from (11).

B.13 Proof of Proposition 11

Proof

We denote $\mathcal {A} = {\text {Diag}}\left( \mathcal {D}_j: j \in [p]\right) $ and $\mathcal {B} = {\text {Diag}}\left( {\text {exp}} \left[ \mathcal {D}_j \right] : j \in [p]\right) $. Applying (11), we get

$$\begin{aligned} {\text {exp}} \left[ \mathcal {A} \right]= & {} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ \hat{A}^{(i)} \right] : i \in [l]\right) \right) \\= & {} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ {\text {Diag}}\left( \hat{{D}_j}^{(i)}: j \in [p]\right) \right] : i \in [l]\right) \right) \\= & {} L^{-1} \left( {\text {fold}}\left( {\text {Diag}}\left( {\text {exp}} \left[ \hat{{D}_j}^{(i)} \right] : j \in [p]\right) : i \in [l]\right) \right) \\= & {} L^{-1} \left( {\text {fold}}\left( \hat{B}^{(i)}: i \in [l]\right) \right) =\mathcal {B} , \end{aligned}$$

where the third equality is due to the property of the matrix exponential [56]: $ {\text {exp}} \left[ {\text {Diag}}\left( C_{i}: i \in [l]\right) \right] = {\text {Diag}}\left( {\text {exp}} \left[ C_i \right] : i \in [l]\right) $.

B.14 Proof of Proposition 12

Proof

It follows from Proposition 2 that

$$\begin{aligned} ( {\text {exp}} \left[ \mathcal {A} \right] )^{\top }= & {} L^{-1} \left( {\text {fold}}\left( \left( {\text {exp}} \left[ \hat{A}^{(i)} \right] \right) ^H: i \in [l]\right) \right) \\= & {} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ (\hat{{A}}^{(i)})^H \right] : i \in [l]\right) \right) = {\text {exp}} \left[ \mathcal {A}^{\top } \right] , \end{aligned}$$

where the second equality comes from the corresponding property in the matrix case [55, Prop. 2.3, 2].

B.15 Proof of Proposition 13

Proof

Using (A2), we have

$$\begin{aligned} {\text {exp}} \left[ \mathcal {A} \right] * {\text {exp}} \left[ \mathcal {B} \right]= & {} L^{-1}(L( {\text {exp}} \left[ \mathcal {A} \right] )\Delta L( {\text {exp}} \left[ \mathcal {B} \right] ))\\= & {} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ \hat{{A}}^{(i)} \right] {\text {exp}} \left[ \hat{{B}}^{(i)} \right] : i \in [l]\right) \right) \\= & {} L^{-1} \left( {\text {fold}}\left( {\text {exp}} \left[ \hat{{A}}^{(i)}+\hat{{B}}^{(i)} \right] : i \in [l]\right) \right) = {\text {exp}} \left[ \mathcal {A}+\mathcal {B} \right] , \end{aligned}$$

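where the third equality comes from the corresponding property of the matrix exponential [55, Prop. 2.3, 5], which applies when $\hat{{A}}^{(i)}$ and $\hat{{B}}^{(i)}$ commute for every $i \in [l]$.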
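The transpose and product identities of B.14 and B.15 can be spot-checked numerically. The sketch below assumes that $L$ is the discrete Fourier transform along the third mode, that the t-transpose transposes each frontal slice and reverses the order of slices $2,\ldots ,l$, and that Proposition 13 concerns commuting tensors (as the matrix property requires). The commuting pair is obtained by taking $\mathcal {B}=\mathcal {A}*\mathcal {A}$. All helper names are hypothetical.

```python
import numpy as np

def t_product(A, B):
    """t-product via slice-wise matrix products in the Fourier domain."""
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    return np.real(np.fft.ifft(np.einsum("ijk,jrk->irk", A_hat, B_hat), axis=2))

def t_transpose(A):
    """Transpose every frontal slice, then reverse the order of slices 2..l."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def expm_taylor(M, terms=40):
    """Truncated Taylor series of the matrix exponential (assumption: small norm)."""
    out = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def t_exp(A):
    """t-exponential via slice-wise matrix exponentials in the Fourier domain."""
    A_hat = np.fft.fft(A, axis=2)
    E_hat = np.stack([expm_taylor(A_hat[:, :, i]) for i in range(A.shape[2])], axis=2)
    return np.real(np.fft.ifft(E_hat, axis=2))

rng = np.random.default_rng(2)
A = 0.3 * rng.standard_normal((3, 3, 4))
B = t_product(A, A)  # commutes with A under the t-product

# Transpose identity: (exp[A])^T = exp[A^T]
transpose_ok = np.allclose(t_transpose(t_exp(A)), t_exp(t_transpose(A)))

# Product identity for a commuting pair: exp[A] * exp[B] = exp[A + B]
product_ok = np.allclose(t_product(t_exp(A), t_exp(B)), t_exp(A + B))

print("transpose identity holds:", transpose_ok)
print("product identity holds:", product_ok)
```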

C Proofs of the Analytical Solution of the t-Sylvester Equation in Theorem 17

Lemma 11

Let $\mathcal {A}\in \mathbb {R}^{m\times n \times l},\mathcal {B}\in \mathbb {R}^{n\times k\times l} $. Then

$$\begin{aligned} (I_{kl}\otimes [A^{(1)},\cdots ,A^{(l)}]){\text {vec}}(\widetilde{{\text {bcirc}}}(\mathcal {B}))\\ = ([I_k]_{l\times l}\circledast {\text {bcirc}}(\mathcal {A}) ){\text {vec}}(\mathcal {B}). \end{aligned}$$

Proof

By definition, the left-hand side part is

$$\begin{aligned}{} & {} \begin{bmatrix} [A^{(1)},\cdots ,A^{(l)}] &{}\quad \ddots &{} &{} &{} &{} &{} \\ &{} &{}\quad [A^{(1)},\cdots ,A^{(l)}] &{} &{} &{} &{} \\ &{} &{} &{}\quad \ddots &{} &{} &{} \\ &{} &{} &{} &{} \quad [A^{(1)},\cdots ,A^{(l)}] &{} &{} \\ &{} &{} &{} &{} &{}\quad \ddots &{} \\ &{} &{} &{} &{} &{} &{}\quad [A^{(1)},\cdots ,A^{(l)}] \end{bmatrix}_{kl} \cdot \begin{bmatrix} \begin{bmatrix} B^{(1)}\\ B^{(l)}\\ \vdots \\ B^{(2)} \end{bmatrix}_{:1}\\ \vdots \\ \begin{bmatrix} B^{(1)}\\ B^{(l)}\\ \vdots \\ B^{(2)} \end{bmatrix}_{:k} \\ \vdots \\ \begin{bmatrix} B^{(l)}\\ B^{(l-1)}\\ \vdots \\ B^{(1)} \end{bmatrix}_{:1}\\ \vdots \\ \begin{bmatrix} B^{(l)}\\ B^{(l-1)}\\ \vdots \\ B^{(1)} \end{bmatrix}_{:k} \end{bmatrix}, \end{aligned}$$

where $B^{(i)}_{:j}$ is the jth column of $B^{(i)}, i\in [l]$ and the right-hand side part is

$$\begin{aligned}{} & {} \begin{bmatrix} \begin{bmatrix} A^{(1)} &{} &{} \\ &{}\quad \ddots &{} \\ &{} &{}\quad A^{(1)} \end{bmatrix}_k &{} \begin{bmatrix} A^{(l)} &{} &{} \\ &{}\quad \ddots &{} \\ &{} &{}\quad A^{(l)} \end{bmatrix}_k &{} \cdots &{} \begin{bmatrix} A^{(2)} &{} &{} \\ &{}\quad \ddots &{} \\ &{} &{}\quad A^{(2)} \end{bmatrix}_k\\ \begin{bmatrix} A^{(2)} &{} &{} \\ &{}\quad \ddots &{} \\ &{} &{}\quad A^{(2)} \end{bmatrix}_k &{} \begin{bmatrix} A^{(1)} &{} &{} \\ &{}\quad \ddots &{} \\ &{} &{}\quad A^{(1)} \end{bmatrix}_k &{} \cdots &{} \begin{bmatrix} A^{(3)} &{} &{} \\ &{}\quad \ddots &{} \\ &{} &{}\quad A^{(3)} \end{bmatrix}_k\\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ \begin{bmatrix} A^{(l)} &{} &{} \\ &{}\quad \ddots &{} \\ &{} &{}\quad A^{(l)} \end{bmatrix}_k &{} \begin{bmatrix} A^{(l-1)} &{} &{} \\ &{}\quad \ddots &{} \\ &{} &{}\quad A^{(l-1)} \end{bmatrix}_k &{} \cdots &{} \begin{bmatrix} A^{(1)} &{} &{} \\ &{}\quad \ddots &{} \\ &{} &{}\quad A^{(1)} \end{bmatrix}_k \end{bmatrix}_l \cdot \begin{bmatrix} \begin{bmatrix} (B^{(1)})_{:1}\\ \vdots \\ (B^{(1)})_{:k} \end{bmatrix}\\ \begin{bmatrix} (B^{(2)})_{:1}\\ \vdots \\ (B^{(2)})_{:k} \end{bmatrix}\\ \vdots \\ \begin{bmatrix} (B^{(l)})_{:1}\\ \vdots \\ (B^{(l)})_{:k} \end{bmatrix}\\ \end{bmatrix}. \end{aligned}$$

We observe that the $(q,1)$-th block of the partitioned matrix on the left-hand side is

$$\begin{aligned} \sum \nolimits _{i=1}^{l}A^{(i)}B^{(h_i)}_{:j}, \qquad q = (p-1)k+j\in [kl], ~ j\in [k],~p\in [l], \end{aligned}$$

(A3)

where

$$\begin{aligned} h_i = {\left\{ \begin{array}{ll} l+p+1-i,&{}\quad i> p, \\ p+1-i,&{}\quad i\leqslant p. \end{array}\right. } \end{aligned}$$

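Meanwhile, the $(q,1)$-th block of the partitioned matrix on the right-hand side is $\sum \nolimits _{i=1}^{l}A^{(h_i)}B^{(i)}_{:j}$, which is equal to (A3) because $i \mapsto h_i$ is a bijection on $[l]$ satisfying $h_{h_i}=i$: re-indexing the sum by $i' = h_i$ turns one expression into the other.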

Lemma 12

[57] Let $C\in \mathbb R^{m\times n} ,X\in \mathbb R^{n\times p} ,B\in \mathbb R^{k\times p} $. Then

$$\begin{aligned} Y = CXB^{\top }\Leftrightarrow {\text {vec}}({Y}) = (B\otimes C){\text {vec}}({X}). \end{aligned}$$
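A quick numerical check of Lemma 12 may help fix the conventions. The sketch assumes that ${\text {vec}}$ stacks the columns of its argument (column-major order) and that $\otimes $ is the Kronecker product as implemented by `numpy.kron`. The helper name `vec` is hypothetical.

```python
import numpy as np

def vec(M):
    """Column-stacking vectorization (column-major order)."""
    return M.reshape(-1, order="F")

rng = np.random.default_rng(3)
m, n, p, k = 2, 3, 4, 5
C = rng.standard_normal((m, n))
X = rng.standard_normal((n, p))
B = rng.standard_normal((k, p))

Y = C @ X @ B.T                 # Y is m x k
lhs = vec(Y)                    # length m*k
rhs = np.kron(B, C) @ vec(X)    # (k*m x p*n) times a vector of length n*p

lemma12_ok = np.allclose(lhs, rhs)
print("vec(C X B^T) == (B kron C) vec(X):", lemma12_ok)
```

Note that the identity relies on column-major vectorization; with NumPy's default row-major flattening the roles of $B$ and $C$ in the Kronecker product would be swapped.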

Lemma 13

Let $\mathcal {A}\in \mathbb {R}^{m\times n \times l},\mathcal {B}\in \mathbb {R}^{n\times k\times l},\mathcal {C}\in \mathbb {R}^{m\times k\times l}$. Then

$$\begin{aligned} \mathcal {C} = \mathcal {A}*\mathcal {B}\Leftrightarrow {\text {vec}}(\mathcal {C}) = (\widetilde{{\text {bcirc}}}(\mathcal {B})^{\top }\otimes I_m){\text {vec}}(\mathcal {A})=([I_k]_{l\times l}\circledast {\text {bcirc}}(\mathcal {A}) ){\text {vec}}(\mathcal {B}). \end{aligned}$$

Proof

We observe that ${\text {vec}}(\mathcal {C}) = {\text {vec}}([C^{(1)},\cdots ,C^{(l)}])$. Since ${\text {unfold}}(\mathcal {C}) = {\text {bcirc}}(\mathcal {A}){\text {unfold}}(\mathcal {B}),$ i.e., $[C^{(1)},\cdots ,C^{(l)}] = [A^{(1)},\cdots ,A^{(l)}]\widetilde{{\text {bcirc}}}(\mathcal {B})$, we have

$$\begin{aligned} {\text {vec}}(\mathcal {C})= & {} {\text {vec}}([C^{(1)},\cdots ,C^{(l)}])\\= & {} {\text {vec}}([A^{(1)},\cdots ,A^{(l)}]\widetilde{{\text {bcirc}}}(\mathcal {B}))\\= & {} (\widetilde{{\text {bcirc}}}(\mathcal {B})^{\top }\otimes I_m){\text {vec}}([A^{(1)},\cdots ,A^{(l)}])\\= & {} (\widetilde{{\text {bcirc}}}(\mathcal {B})^{\top }\otimes I_m){\text {vec}}(\mathcal {A}), \end{aligned}$$

where the third equality comes from Lemma 12. Similarly, by Lemma 11, there holds

$$\begin{aligned} {\text {vec}}(\mathcal {C})= & {} {\text {vec}}([C^{(1)},\cdots ,C^{(l)}])\\= & {} {\text {vec}}([A^{(1)},\cdots ,A^{(l)}]\widetilde{{\text {bcirc}}}(\mathcal {B}))\\= & {} (I_{kl}\otimes [A^{(1)},\cdots ,A^{(l)}]){\text {vec}}(\widetilde{{\text {bcirc}}}(\mathcal {B}))\\= & {} ([I_k]_{l\times l}\circledast {\text {bcirc}}(\mathcal {A}) ){\text {vec}}(\mathcal {B}), \end{aligned}$$

where the third equality follows from Lemma 12 and the fourth from Lemma 11.
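The two vectorized forms of the t-product in Lemma 13 can be verified on a small random instance. The sketch below makes several assumptions read off from the displays in this appendix: ${\text {vec}}$ of a tensor stacks the columns of $[A^{(1)},\cdots ,A^{(l)}]$; $\widetilde{{\text {bcirc}}}(\mathcal {B})$ has block columns $[B^{(1)};B^{(l)};\cdots ;B^{(2)}]$ through $[B^{(l)};B^{(l-1)};\cdots ;B^{(1)}]$; and $[I_k]_{l\times l}\circledast {\text {bcirc}}(\mathcal {A})$ is the block-circulant matrix whose blocks are $I_k\otimes A^{(i)}$. All function names are hypothetical.

```python
import numpy as np

def vec(T):
    """Column-major vectorization; for a tensor this stacks the columns of [T^(1), ..., T^(l)]."""
    return T.reshape(-1, order="F")

def t_product_direct(A, B):
    """t-product as a circular convolution of frontal slices (0-based slice indices)."""
    m, _, l = A.shape
    k = B.shape[1]
    C = np.zeros((m, k, l))
    for c in range(l):
        for r in range(l):
            C[:, :, c] += A[:, :, r] @ B[:, :, (c - r) % l]
    return C

def bcirc_tilde(B):
    """Matrix with [C^(1),...,C^(l)] = [A^(1),...,A^(l)] @ bcirc_tilde(B)."""
    l = B.shape[2]
    return np.block([[B[:, :, (c - r) % l] for c in range(l)] for r in range(l)])

def circ_kron(A, k):
    """Block-circulant matrix with blocks I_k kron A^(i), as in the proof of Lemma 11."""
    l = A.shape[2]
    return np.block(
        [[np.kron(np.eye(k), A[:, :, (r - c) % l]) for c in range(l)] for r in range(l)]
    )

rng = np.random.default_rng(4)
m, n, k, l = 2, 3, 2, 3
A = rng.standard_normal((m, n, l))
B = rng.standard_normal((n, k, l))
C = t_product_direct(A, B)

row_A = np.hstack([A[:, :, i] for i in range(l)])          # [A^(1), ..., A^(l)]
form_in_A = np.kron(bcirc_tilde(B).T, np.eye(m)) @ vec(A)  # linear in vec(A)
form_in_B = circ_kron(A, k) @ vec(B)                       # linear in vec(B)
lemma11_lhs = np.kron(np.eye(k * l), row_A) @ vec(bcirc_tilde(B))

print(np.allclose(vec(C), form_in_A), np.allclose(vec(C), form_in_B))
```

Having both forms is what makes a Sylvester-type equation in an unknown tensor expressible as a single linear system in the vectorized unknown.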

Proof

Applying Lemma 13, the tensor Sylvester equation (36) can be rewritten in the form

$$\begin{aligned} {\text {vec}}(\mathcal {C}) = \left( \widetilde{{\text {bcirc}}}(\mathcal {B})^{\top }\otimes I_k+[I_k]_{l\times l}\circledast {\text {bcirc}}(\mathcal {A}) \right) {\text {vec}}(\mathcal {X}). \end{aligned}$$

(A4)

D The Euclidean Gradient ${\text {grad}}f(\mathcal {X} )$ and the Euclidean directional derivative $Df(\mathcal {X} )[\mathcal {H} ]$ in Sect. 3.2

Similar to [13, Def. 4], for a third-order tensor $\mathcal {X}\in \mathbb R^{n\times p \times l} $, we can introduce the Euclidean gradient ${\text {grad}}f(\mathcal {X} )$ and the Euclidean Hessian ${\text {Hess}}f(\mathcal {X} )$ via Fréchet differentiability.

Definition 22

Let $f: \mathcal {U} \subseteq \mathbb {R}^{n \times p \times l} \rightarrow \mathbb {R}$ be a continuous map. Then, we say f is t-differentiable at $\mathcal {X} \in \mathcal {U} $ if and only if there exists a third-order tensor ${\text {grad}}f(\mathcal {X})\in \mathbb R^{n\times p \times l} $ such that

$$\begin{aligned} \lim _{\mathcal {H} \rightarrow \mathcal {O}} \frac{\left\| f(\mathcal {X}+\mathcal {H})-f(\mathcal {X})-\left\langle {\text {grad}}f(\mathcal {X}), \mathcal {H}\right\rangle \right\| _{\textrm{F}}}{\Vert \mathcal {H}\Vert _{\textrm{F}}}=0, \end{aligned}$$

where ${\text {grad}}f(\mathcal {X})$ is called the gradient of f at $\mathcal {X}$ and $Df(\mathcal {X} )[\mathcal {H} ]= \left\langle {\text {grad}}f(\mathcal {X}), \mathcal {H}\right\rangle $ is called the directional derivative of f at $\mathcal {X}$ along $\mathcal {H}$. Moreover, we say f is twice t-differentiable at $\mathcal {X} \in \mathcal {U} $ if and only if f is continuously t-differentiable and there exists a bounded linear operator ${\text {Hess}}f(\mathcal {X}):\mathbb R^{n\times p \times l} \rightarrow \mathbb R^{n\times p \times l} $ such that

$$\begin{aligned} \lim _{\mathcal {H} \rightarrow \mathcal {O}} \frac{\left\| {\text {grad}}f(\mathcal {X}+\mathcal {H})-{\text {grad}}f(\mathcal {X})-{\text {Hess}}f(\mathcal {X})[\mathcal {H} ]\right\| _{\textrm{F}}}{\Vert \mathcal {H}\Vert _{\textrm{F}}}=0. \end{aligned}$$

Furthermore, we say f is t-differentiable (twice t-differentiable) on $\mathcal {U} $ if and only if f is t-differentiable (twice t-differentiable) at every $\mathcal {X} \in \mathcal {U} $.

Theorem 21

Let f be a continuous map from $\mathcal {U} \subseteq \mathbb {R}^{n \times p \times l}$ to $\mathbb {R}$. Then f is t-differentiable on $\mathcal {U} $ if and only if $\frac{\partial f(\mathcal {X})}{\partial [{\text {vec}}(\mathcal {X})]}$ exists for every $\mathcal {X} \in \mathcal {U} $, where $\frac{\partial f(\mathcal {X})}{\partial [{\text {vec}}(\mathcal {X})]}$ is a vector in $\mathbb {R}^{npl}$ with $\left( \frac{\partial f(\mathcal {X})}{\partial [{\text {vec}}(\mathcal {X})]}\right) _{i}=\frac{\partial f(\mathcal {X})}{\partial \left( [{\text {vec}}(\mathcal {X})]_{i}\right) }$ for any $i \in [npl]$. In particular, for any $\mathcal {X} \in \mathcal {U} ,$

$$\begin{aligned} {\text {grad}}f(\mathcal {X})={\text {vec}}^{-1} \left( \frac{\partial f(\mathcal {X})}{\partial [{\text {vec}}(\mathcal {X})]} \right) , \end{aligned}$$

(A5)

where $\varvec{v}=\textrm{vec}(\mathcal {A} )$ denotes the vectorization of the tensor $\mathcal {A} $ and $\textrm{vec}^{-1}(\varvec{v})=\mathcal {A} $ denotes the operator that converts the vector $\varvec{v}$ back to the tensor $\mathcal {A} $; both can be implemented with the Matlab functions reshape, permute and ipermute (cf. [58]).

Proof

The proof is similar to that of [13, Thm. 1] and is omitted.
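To illustrate formula (A5), the following sketch assembles the gradient from partial derivatives with respect to the entries of ${\text {vec}}(\mathcal {X})$ and reshapes the result back into a tensor. It uses NumPy's column-major reshape in place of the Matlab functions mentioned above, approximates the partial derivatives by central differences, and takes $f(\mathcal {X})=\frac{1}{2}\Vert \mathcal {X}\Vert _{\textrm{F}}^2$ (whose gradient is $\mathcal {X}$) as an example chosen here for illustration. The helper names are hypothetical.

```python
import numpy as np

def vec(X):
    """Column-major vectorization of a tensor."""
    return X.reshape(-1, order="F")

def vec_inv(v, shape):
    """Inverse of vec: reshape a vector back into a tensor of the given shape."""
    return v.reshape(shape, order="F")

def numerical_grad(f, X, h=1e-6):
    """Gradient as vec^{-1} of the vector of partial derivatives (central differences)."""
    v = vec(X).astype(float)
    g = np.zeros_like(v)
    for i in range(v.size):
        e = np.zeros_like(v)
        e[i] = h
        g[i] = (f(vec_inv(v + e, X.shape)) - f(vec_inv(v - e, X.shape))) / (2 * h)
    return vec_inv(g, X.shape)

def f(X):
    """Example objective: half the squared Frobenius norm."""
    return 0.5 * np.sum(X ** 2)

rng = np.random.default_rng(5)
X = rng.standard_normal((3, 2, 4))
H = rng.standard_normal((3, 2, 4))

G = numerical_grad(f, X)

# Directional derivative Df(X)[H] = <grad f(X), H>, compared with a difference quotient
t = 1e-6
directional_fd = (f(X + t * H) - f(X - t * H)) / (2 * t)
directional_ip = np.sum(G * H)

print("max |G - X| =", np.max(np.abs(G - X)))
print("|Df(X)[H] - <G, H>| =", abs(directional_fd - directional_ip))
```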

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mao, XP., Wang, Y. & Yang, YN. Computation over t-Product Based Tensor Stiefel Manifold: A Preliminary Study. J. Oper. Res. Soc. China (2024). https://doi.org/10.1007/s40305-023-00522-z

Received: 14 December 2022
Revised: 18 September 2023
Accepted: 23 October 2023
Published: 09 January 2024
DOI: https://doi.org/10.1007/s40305-023-00522-z
