Characterizing Spatiotemporal Transcriptome of the Human Brain Via Low-Rank Tensor Decomposition


Abstract

Spatiotemporal gene expression data of the human brain offer insights into the spatial and temporal patterns of gene regulation during brain development. Most existing methods for analyzing these data consider spatial and temporal profiles separately, with the implicit assumptions that different brain regions develop along similar trajectories and that the spatial patterns of gene expression remain similar at different time points. Although these analyses may help delineate gene regulation either spatially or temporally, they cannot characterize heterogeneity in temporal dynamics across brain regions, or the evolution of spatial patterns of gene regulation over time. In this article, we develop a statistical method based on low-rank tensor decomposition to analyze spatiotemporal gene expression data more effectively. We generalize classical principal component analysis (PCA), which is applicable only to data matrices, to a tensor PCA that can capture spatial and temporal effects simultaneously. We also propose an efficient algorithm that combines tensor unfolding and power iteration to estimate the tensor principal components, and we provide guarantees on their statistical performance. Numerical experiments are presented to further demonstrate the merits of the proposed method. An application of our method to a spatiotemporal brain expression dataset provides insights into gene regulation patterns in the brain.
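
As a concrete illustration of how tensor PCA separates spatial and temporal effects where classical PCA on the unfolded data does not: a rank-one space-time loading of length \(d_S\cdot d_T\) reshapes into a rank-one \(d_S\times d_T\) matrix whose leading singular vectors are exactly the spatial and temporal components. The short NumPy sketch below demonstrates this identity only; it is our own illustration, with dimensions and variable names chosen by us, not code accompanying the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_S, d_T = 10, 15                       # brain regions x time points

# A "true" spatial factor v and temporal factor w (unit vectors).
v = rng.normal(size=d_S); v /= np.linalg.norm(v)
w = rng.normal(size=d_T); w /= np.linalg.norm(w)

# Classical PCA on the gene-by-(region*time) unfolding returns a single
# loading of length d_S*d_T in which space and time are entangled:
h = np.kron(v, w)                       # vec of the outer product v w^T

# The tensor view: reshape the loading back into a d_S x d_T matrix ...
H = h.reshape(d_S, d_T)                 # equals np.outer(v, w), a rank-one matrix

# ... whose leading singular vectors recover the spatial and temporal
# components separately (up to a common sign flip).
U, s, Vt = np.linalg.svd(H)
v_hat, w_hat = U[:, 0], Vt[0]
print(np.allclose(H, np.outer(v, w)))                                      # True
print(min(np.linalg.norm(v_hat - v), np.linalg.norm(v_hat + v)) < 1e-10)   # True
print(min(np.linalg.norm(w_hat - w), np.linalg.norm(w_hat + w)) < 1e-10)   # True
```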


References

1. Alter O, Brown P, Botstein D (2000) Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci 97:10101–10106

2. Coyle JT, Price DL, DeLong MR (1983) Alzheimer’s disease: a disorder of cortical cholinergic innervation. Science 219(4589):1184–1190

3. De Lathauwer L, De Moor B, Vandewalle J (2000) A multilinear singular value decomposition. SIAM J Matrix Anal Appl 21(4):1253–1278

4. de Silva V, Lim LH (2008) Tensor rank and the ill-posedness of the best low-rank approximation problem. SIAM J Matrix Anal Appl 30(3):1084–1127

5. Donoso M, Collins AG, Koechlin E (2014) Foundations of human reasoning in the prefrontal cortex. Science 344(6191):1481–1486

6. Fjell AM, Westlye LT, Amlien I, Espeseth T, Reinvang I, Raz N, Agartz I, Salat DH, Greve DN, Fischl B et al (2009) High consistency of regional cortical thinning in aging across multiple samples. Cereb Cortex 19:2001–2012

7. Gutchess AH, Kensinger EA, Schacter DL (2007) Aging, self-referencing, and medial prefrontal cortex. Soc Neurosci 2(2):117–133

8. Hawrylycz M, Miller JA, Menon V, Feng D, Dolbeare T, Guillozet-Bongaarts AL, Jegga AG, Aronow BJ, Lee CK, Bernard A et al (2015) Canonical genetic signatures of the adult human brain. Nat Neurosci 18(12):1832

9. Hillar C, Lim L (2013) Most tensor problems are NP-hard. J ACM 60(6):45

10. Jolliffe I (2002) Principal component analysis. Springer, Berlin

11. Kandel ER, Schwartz JH, Jessell TM et al (2000) Principles of neural science, vol 4. McGraw-Hill, New York

12. Kang HJ, Kawasawa YI, Cheng F, Zhu Y, Xu X, Li M, Sousa AM, Pletikos M, Meyer KA, Sedmak G et al (2011) Spatio-temporal transcriptome of the human brain. Nature 478(7370):483–489

13. Kato T (1982) A short introduction to perturbation theory for linear operators. Springer, New York

14. Kolda TG, Bader BW (2009) Tensor decompositions and applications. SIAM Rev 51:455–500

15. Koltchinskii V, Lounici K (2014) Asymptotics and concentration bounds for bilinear forms of spectral projectors of sample covariance. arXiv:1408.4643

16. Landel V, Baranger K, Virard I, Loriod B, Khrestchatisky M, Rivera S, Benech P, Féron F (2014) Temporal gene profiling of the 5xFAD transgenic mouse model highlights the importance of microglial activation in Alzheimer’s disease. Mol Neurodegener 9(1):1–18

17. Laurent B, Massart P (2000) Adaptive estimation of a quadratic functional by model selection. Ann Stat 28(5):1302–1338

18. Lauria G, Holland N, Hauer P, Cornblath DR, Griffin JW, McArthur JC (1999) Epidermal innervation: changes with aging, topographic location, and in sensory neuropathy. J Neurol Sci 164(2):172–178

19. Lein ES, Hawrylycz MJ, Ao N, Ayres M, Bensinger A, Bernard A, Boe AF, Boguski MS, Brockway KS, Byrnes EJ et al (2007) Genome-wide atlas of gene expression in the adult mouse brain. Nature 445(7124):168–176

20. Lidskii V (1950) The proper values of the sum and product of symmetric matrices. Dokl Akad Nauk SSSR 75:769–772

21. Lin Z, Sanders SJ, Li M, Sestan N, State MW, Zhao H (2015) A Markov random field-based approach to characterizing human brain development using spatial-temporal transcriptome data. Ann Appl Stat 9(1):429

22. Liu T, Lee KY, Zhao H (2016) Ultrahigh dimensional feature selection via kernel canonical correlation analysis. arXiv:1604.07354

23. Luebke J, Chang YM, Moore T, Rosene D (2004) Normal aging results in decreased synaptic excitation and increased synaptic inhibition of layer 2/3 pyramidal cells in the monkey prefrontal cortex. Neuroscience 125(1):277–288

24. Miller JA, Ding SL, Sunkin SM, Smith KA, Ng L, Szafer A, Ebbert A, Riley ZL, Royall JJ, Aiona K et al (2014) Transcriptional landscape of the prenatal human brain. Nature 508(7495):199–206

25. Montanari A, Richard E (2014) A statistical model for tensor PCA. In: Advances in Neural Information Processing Systems (NIPS)

26. Muirhead RJ (2009) Aspects of multivariate statistical theory, vol 197. Wiley, Hoboken

27. Nevalainen P, Lauronen L, Pihko E (2014) Development of human somatosensory cortical functions—what have we learned from magnetoencephalography: a review. Front Hum Neurosci 8:158

28. Pardo JV, Lee JT, Sheikh SA, Surerus-Johnson C, Shah H, Munch KR, Carlis JV, Lewis SM, Kuskowski MA, Dysken MW (2007) Where the brain grows old: decline in anterior cingulate and medial prefrontal function with normal aging. Neuroimage 35(3):1231–1237

29. Parikshak NN, Luo R, Zhang A, Won H, Lowe JK, Chandran V, Horvath S, Geschwind DH (2013) Integrative functional genomic analyses implicate specific molecular pathways and circuits in autism. Cell 155(5):1008–1021

30. Pletikos M, Sousa AM, Sedmak G, Meyer KA, Zhu Y, Cheng F, Li M, Kawasawa YI, Šestan N (2014) Temporal specification and bilaterality of human neocortical topographic gene expression. Neuron 81(2):321–332

31. Raz N, Lindenberger U, Rodrigue KM, Kennedy KM, Head D, Williamson A, Dahle C, Gerstorf D, Acker JD (2005) Regional brain changes in aging healthy adults: general trends, individual differences and modifiers. Cereb Cortex 15(11):1676–1689

32. Sabatinelli D, Bradley MM, Fitzsimmons JR, Lang PJ (2005) Parallel amygdala and inferotemporal activation reflect emotional intensity and fear relevance. Neuroimage 24(4):1265–1270

33. Sheehan BN, Saad Y (2007) Higher order orthogonal iteration of tensors (HOOI) and its relation to PCA and GLRAM. In: Proceedings of the 2007 SIAM International Conference on Data Mining, SIAM, pp 355–365

34. Vershynin R (2012) Introduction to the non-asymptotic analysis of random matrices. In: Compressed Sensing. Cambridge University Press, Cambridge, pp 210–268

35. Wall M, Dyck P, Brettin T (2001) Singular value decomposition analysis of microarray data. Bioinformatics 17:566–568

36. Wedin P (1972) Perturbation bounds in connection with singular value decomposition. BIT Numer Math 12(1):99–111

37. Wen X, Fuhrman S, Michaels GS, Carr DB, Smith S, Barker JL, Somogyi R (1998) Large-scale temporal gene expression mapping of central nervous system development. Proc Natl Acad Sci 95(1):334–339

38. Yeung KY, Ruzzo WL (2001) Principal component analysis for clustering gene expression data. Bioinformatics 17(9):763–774

39. Zhang T, Golub GH (2001) Rank-one approximation to high order tensors. SIAM J Matrix Anal Appl 23(2):534–550

Author information

Corresponding author

Correspondence to Tianqi Liu.

Appendices

Appendix A: Proofs

Proof of Theorem 1

Write

$$\begin{aligned} {\varvec {T}}=\sqrt{d_G}\sum _{k=1}^r \lambda _k \left ( {\varvec {u}}_k\otimes {\varvec {v}}_k\otimes {\varvec {w}}_k\right) . \end{aligned}$$

Then \({\varvec {X}}={\varvec {T}}+{\varvec {E}}\). Denote by

$$\begin{aligned} X_g= (x_{gst})_{1\le s\le d_S, 1\le t\le d_T}. \end{aligned}$$

Let \(T_g\), \(E_g\) be similarly defined. Then

$$\begin{aligned} \frac{1}{d_G}{\mathcal {M}} ({\varvec {X}})^\top {\mathcal {M}} ({\varvec {X}})= & {} \frac{1}{d_G}\sum _{g=1}^{d_G}\text {vec} (X_g)\otimes \text {vec} (X_g)\\= & {} {\mathcal {M}}\left ( \frac{1}{d_G}\sum _{g=1}^{d_G}X_g\otimes X_g\right) \\= & {} {\mathcal {M}}\left ( \frac{1}{d_G}\sum _{g=1}^{d_G} T_g\otimes T_g +\frac{1}{d_G}\sum _{g=1}^{d_G} E_g\otimes E_g +\frac{1}{d_G}\sum _{g=1}^{d_G} \left ( T_g\otimes E_g+E_g\otimes T_g\right) \right) . \end{aligned}$$

Hereafter, with slight abuse of notation, we use \({\mathcal {M}}\) to denote the matricization operator that collapses the first two and the remaining two indices of a fourth-order tensor, respectively. Observe that

$$\begin{aligned} T_g=\sqrt{d_G}\sum _{k=1}^r \lambda _k u_{kg} \left ( {\varvec {v}}_k\otimes {\varvec {w}}_k\right) . \end{aligned}$$

Therefore

$$\begin{aligned} T_g\otimes T_g = d_G\sum _{k_1,k_2=1}^r \lambda _{k_1}\lambda _{k_2} u_{k_1g}u_{k_2g}\left ( {\varvec {v}}_{k_1}\otimes {\varvec {w}}_{k_1}\otimes {\varvec {v}}_{k_2}\otimes {\varvec {w}}_{k_2}\right) . \end{aligned}$$

Because of the orthogonality among \({\varvec {u}}_k\)s, we get

$$\begin{aligned} \frac{1}{d_G}\sum _{g=1}^{d_G} T_g\otimes T_g=\sum _{k=1}^r \lambda _k^2 \left ( ({\varvec {v}}_{k}\otimes {\varvec {w}}_{k})\otimes ({\varvec {v}}_{k}\otimes {\varvec {w}}_{k})\right) . \end{aligned}$$

On the other hand, note that

$$\begin{aligned} {\mathcal {M}}\left ( {1\over d_G}\sum _{g=1}^{d_G} E_g\otimes E_g\right) ={1\over d_G}\sum _{g=1}^{d_G} \left ( \text {vec} (E_g)\otimes \text {vec} (E_g)\right) . \end{aligned}$$

In other words, \({\mathcal {M}} (d_G^{-1}\sum _{g=1}^{d_G} E_g\otimes E_g)\) is the sample covariance matrix of independent Gaussian vectors

$$\begin{aligned} \text {vec} (E_g)\sim N (0, I_{d_S\cdot d_T}), \qquad 1\le g\le d_G. \end{aligned}$$

Therefore, there exists an absolute constant \(C_1>0\) such that

$$\begin{aligned} \left\| {\mathcal {M}}\left ( {1\over d_G}\sum _{g=1}^{d_G} E_g\otimes E_g\right) -I_{d_S\cdot d_T}\right\| \le C_1\sigma ^2\sqrt{d_Sd_T\over d_G}. \end{aligned}$$

with probability tending to one as \(d_G\rightarrow \infty\). See, e.g., [34].

Finally, observe that

$$\begin{aligned} \sum _{g=1}^{d_G}T_g\otimes E_g=\sqrt{d_G}\sum _{k=1}^r \lambda _k \left[ {\varvec {v}}_k\otimes {\varvec {w}}_k\otimes \left ( \sum _{g=1}^{d_G}u_{kg}E_g\right) \right] =:\sqrt{d_G}\sum _{k=1}^r \lambda _k \left ( {\varvec {v}}_k\otimes {\varvec {w}}_k\otimes Z_k\right) . \end{aligned}$$

By the orthogonality of \({\varvec {u}}_k\)s, it is not hard to see that \(Z_k\)s are independent Gaussian matrices:

$$\begin{aligned} \text {vec} (Z_k)\sim N\left ( 0,\sigma ^2I_{d_S\cdot d_T}\right) , \end{aligned}$$

so that there exists an absolute constant \(C_2>0\) such that

$$\begin{aligned} \left\| {\mathcal {M}}\left ( {1\over d_G}\sum _{g=1}^{d_G} \left ( T_g\otimes E_g+E_g\otimes T_g\right) \right) \right\| \le {2\over d_G}\left\| {\mathcal {M}}\left ( \sum _{g=1}^{d_G} T_g\otimes E_g\right) \right\| \le C_2\lambda _1\sigma \sqrt{d_Sd_T\over d_G}, \end{aligned}$$

with probability tending to one.

To sum up, we get

$$\begin{aligned} \left\| {1\over d_G}{\mathcal {M}} ({\varvec {X}})^\top {\mathcal {M}} ({\varvec {X}})-A\right\| \le (C_1\sigma ^2+C_2\lambda _1\sigma )\sqrt{d_Sd_T\over d_G}. \end{aligned}$$

where

$$\begin{aligned} A=I_{d_S\cdot d_T}+\sum _{k=1}^r \lambda _k^2 \left[ \text {vec}\left ( {\varvec {v}}_k\otimes {\varvec {w}}_k\right) \otimes \text {vec}\left ( {\varvec {v}}_k\otimes {\varvec {w}}_k\right) \right] . \end{aligned}$$

It is clear that

$$\begin{aligned} \left\{ (1+\lambda _k^2, \text {vec} ({\varvec {v}}_k\otimes {\varvec {w}}_k)): 1\le k\le r\right\} \end{aligned}$$

are the leading eigenvalue-eigenvector pairs of A.

Recall that \(({\widehat{\lambda }}_k^2, {\widehat{{\varvec {h}}}}_k)\) is the kth eigenvalue-eigenvector pair of \({\mathcal {M}} ({\varvec {X}})^\top {\mathcal {M}} ({\varvec {X}})\). By Lidskii’s inequality,

$$\begin{aligned} |{\widehat{\lambda }}_k^2-\lambda _k^2|\le (C_1\sigma ^2+C_2\lambda _1\sigma )\sqrt{d_Sd_T\over d_G}. \end{aligned}$$

See, e.g., [13, 20]. Then

$$\begin{aligned} \Vert \text {vec}^{-1} ({\widehat{{\varvec {h}}}}_k)-{\varvec {v}}_k\otimes {\varvec {w}}_k\Vert ^2\le & {} \Vert \text {vec}^{-1} ({\widehat{{\varvec {h}}}}_k)-{\varvec {v}}_k\otimes {\varvec {w}}_k\Vert _\text{F}^2\\= & {} 2-2\langle {\widehat{{\varvec {h}}}}_k,\text {vec} ({\varvec {v}}_k\otimes {\varvec {w}}_k)\rangle \\\le & {} 2\left\| {\widehat{{\varvec {h}}}}_k\otimes {\widehat{{\varvec {h}}}}_k-\text {vec} ({\varvec {v}}_k\otimes {\varvec {w}}_k)\otimes \text {vec} ({\varvec {v}}_k\otimes {\varvec {w}}_k)\right\| \\\le & {} 8 (C_1\sigma ^2+C_2\lambda _1\sigma )g_k^{-1}\sqrt{d_Sd_T\over d_G}, \end{aligned}$$

where the last inequality follows from Lemma 1 from [15]. For large enough C, we can ensure that

$$\begin{aligned} \Vert \text {vec}^{-1} ({\widehat{{\varvec {h}}}}_k)-{\varvec {v}}_k\otimes {\varvec {w}}_k\Vert ^2\le \frac{C}{4} (\sigma ^2+\lambda _1\sigma )g_k^{-1}\sqrt{d_Sd_T\over d_G}\le {1\over 4}. \end{aligned}$$

Recall also that \({\widehat{{\varvec {v}}}}_k\) and \({\widehat{{\varvec {w}}}}_k\) are the leading left and right singular vectors of \(\text {vec}^{-1} ({\widehat{{\varvec {h}}}}_k)\), respectively. By Wedin’s perturbation theorem, we obtain immediately that

$$\begin{aligned} \max \left\{ 1-|\langle {\widehat{{\varvec {v}}}}_k, {\varvec {v}}_k\rangle |, 1-|\langle {\widehat{{\varvec {w}}}}_k,{\varvec {w}}_k\rangle |\right\} \le C (\sigma ^2+\lambda _1\sigma )\sigma ^2g_k^{-1}\sqrt{d_Sd_T\over d_G}. \end{aligned}$$

See, e.g., [25, 36]. □
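
The construction analyzed in this proof can be read as an algorithm: matricize the data tensor along the gene mode, take the top eigenvectors of \(d_G^{-1}{\mathcal {M}} ({\varvec {X}})^\top {\mathcal {M}} ({\varvec {X}})\), reshape each eigenvector \({\widehat{{\varvec {h}}}}_k\) into a \(d_S\times d_T\) matrix, and extract its leading singular vectors. The NumPy sketch below is our own rendering of these steps, assuming the data are stored as an array of shape \((d_G, d_S, d_T)\); the helper name spectral_init is ours, and the returned eigenvalues are simply those of the normalized Gram matrix.

```python
import numpy as np

def spectral_init(X, r):
    """Unfolding-based initialization along the lines of Theorem 1 (our sketch).

    X : array of shape (d_G, d_S, d_T), genes x regions x time points.
    r : assumed number of components.
    Returns V (d_S x r) and W (d_T x r), whose columns are the estimated
    spatial and temporal factors, plus the top-r eigenvalues of the
    normalized Gram matrix M(X)^T M(X) / d_G.
    """
    d_G, d_S, d_T = X.shape
    M = X.reshape(d_G, d_S * d_T)              # mode-1 (gene) matricization M(X)

    G = (M.T @ M) / d_G                        # (d_S*d_T) x (d_S*d_T) Gram matrix
    evals, evecs = np.linalg.eigh(G)           # eigenvalues in ascending order
    evals, evecs = evals[::-1][:r], evecs[:, ::-1][:, :r]

    V, W = np.zeros((d_S, r)), np.zeros((d_T, r))
    for k in range(r):
        H_k = evecs[:, k].reshape(d_S, d_T)    # vec^{-1}(h_hat_k)
        U, _, Vt = np.linalg.svd(H_k)
        V[:, k], W[:, k] = U[:, 0], Vt[0]      # leading left/right singular vectors
    return V, W, evals
```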

Proof of Theorem 2

Denote by

$$\begin{aligned} {\tilde{{\varvec {b}}}}= \left ( {1\over d_G}\sum _{g=1}^{d_G}X_g\otimes X_g\right) \times _2 {\varvec {c}}^{[m-1]}\times _3 {\varvec {c}}^{[m-1]}\times _4 {\varvec {b}}^{[m-1]}-\sigma ^2{\varvec {b}}^{[m-1]}. \end{aligned}$$

It is not hard to see that

$$\begin{aligned} {\varvec {b}}^{[m]}={\tilde{{\varvec {b}}}}/\Vert {\tilde{{\varvec {b}}}}\Vert . \end{aligned}$$

Let \({\mathcal {M}}^{-1}\) be the inverse of the matricization operator \({\mathcal {M}}\) that unfolds a fourth-order tensor into a matrix; that is, \({\mathcal {M}}^{-1}\) reshapes a \((d_Sd_T)\times (d_Sd_T)\) matrix into a fourth-order tensor of size \(d_S\times d_T\times d_S\times d_T\). Observe that

$$\begin{aligned} \frac{1}{d_G}\sum _{g=1}^{d_G}X_g\otimes X_g= & {} \frac{1}{d_G}\sum _{g=1}^{d_G} T_g\otimes T_g +\frac{1}{d_G}\sum _{g=1}^{d_G} E_g\otimes E_g +\frac{1}{d_G}\sum _{g=1}^{d_G} \left ( T_g\otimes E_g+E_g\otimes T_g\right) \\= & {} \lambda _k^2 \left ( ({\varvec {v}}_{k}\otimes {\varvec {w}}_{k})\otimes ({\varvec {v}}_{k}\otimes {\varvec {w}}_{k})\right) +\sum _{j\ne k} \lambda _j^2 \left ( ({\varvec {v}}_{j}\otimes {\varvec {w}}_{j})\otimes ({\varvec {v}}_{j}\otimes {\varvec {w}}_{j})\right) \\&+\sigma ^2{\mathcal {M}}^{-1} (I_{d_S\cdot d_T})+\left ( \frac{1}{d_G}\sum _{g=1}^{d_G} E_g\otimes E_g -{\mathcal {M}}^{-1} (I_{d_S\cdot d_T})\right) \\&+\frac{1}{d_G}\sum _{g=1}^{d_G} \left ( T_g\otimes E_g+E_g\otimes T_g\right) \\=: & {} \lambda _k^2 \left ( ({\varvec {v}}_{k}\otimes {\varvec {w}}_{k})\otimes ({\varvec {v}}_{k}\otimes {\varvec {w}}_{k})\right) +\varDelta _1+\sigma ^2{\mathcal {M}}^{-1} (I_{d_S\cdot d_T})+\varDelta _2+\varDelta _3. \end{aligned}$$

We get

$$\begin{aligned} {\tilde{{\varvec {b}}}}=\lambda _k^2 \langle {\varvec {b}}^{[m-1]},{\varvec {v}}_k\rangle \langle {\varvec {c}}^{[m-1]},{\varvec {w}}_k\rangle ^2{\varvec {v}}_k+ (\varDelta _1+\varDelta _2+\varDelta _3)\times _2 {\varvec {c}}^{[m-1]}\times _3 {\varvec {c}}^{[m-1]}\times _4 {\varvec {b}}^{[m-1]}, \end{aligned}$$

where we used the fact that

$$\begin{aligned} {\mathcal {M}}^{-1} (I_{d_S\cdot d_T})\times _2 {\varvec {c}}^{[m-1]}\times _3 {\varvec {c}}^{[m-1]}\times _4 {\varvec {b}}^{[m-1]}={\varvec {b}}^{[m-1]}. \end{aligned}$$

Therefore

$$\begin{aligned} |\langle {\tilde{{\varvec {b}}}},{\varvec {v}}_k\rangle |= & {} \left| \lambda _k^2 \langle {\varvec {b}}^{[m-1]},{\varvec {v}}_k\rangle \langle {\varvec {c}}^{[m-1]},{\varvec {w}}_k\rangle ^2+\langle \varDelta _1+\varDelta _2+\varDelta _3, {\varvec {v}}_k\otimes {\varvec {c}}^{[m-1]}\otimes {\varvec {c}}^{[m-1]}\otimes {\varvec {b}}^{[m-1]}\rangle \right| \\= & {} \lambda _k^2 |\langle {\varvec {b}}^{[m-1]},{\varvec {v}}_k\rangle |\langle {\varvec {c}}^{[m-1]},{\varvec {w}}_k\rangle ^2+\left| \langle \varDelta _2+\varDelta _3, {\varvec {v}}_k\otimes {\varvec {c}}^{[m-1]}\otimes {\varvec {c}}^{[m-1]}\otimes {\varvec {b}}^{[m-1]}\rangle \right| \\\ge & {} \lambda _k^2 |\langle {\varvec {b}}^{[m-1]},{\varvec {v}}_k\rangle |\langle {\varvec {c}}^{[m-1]},{\varvec {w}}_k\rangle ^2-\Vert \varDelta _2+\varDelta _3\Vert . \end{aligned}$$

Denote by

$$\begin{aligned} \tau _m=\min \left\{ |\langle {\varvec {b}}^{[m]},{\varvec {v}}_k\rangle |, |\langle {\varvec {c}}^{[m]},{\varvec {w}}_k\rangle |\right\} . \end{aligned}$$

Then,

$$\begin{aligned} \left| \langle {\tilde{{\varvec {b}}}},{\varvec {v}}_k\rangle \right| \ge \lambda _k^2\tau _{m-1}^3-\Vert \varDelta _2+\varDelta _3\Vert . \end{aligned}$$

On the other hand, note that

$$\begin{aligned} \Vert {\tilde{{\varvec {b}}}}\Vert =\langle {\tilde{{\varvec {b}}}}, {\varvec {b}}^{[m]}\rangle\le & {} \lambda _k^2 \langle {\varvec {b}}^{[m-1]},{\varvec {v}}_k\rangle \langle {\varvec {c}}^{[m-1]},{\varvec {w}}_k\rangle ^2\langle {\varvec {v}}_k,{\varvec {b}}^{[m]}\rangle \\&+\left\langle \varDelta _1+\varDelta _2+\varDelta _3, {\varvec {b}}^{[m]}\otimes {\varvec {c}}^{[m-1]}\otimes {\varvec {c}}^{[m-1]}\otimes {\varvec {b}}^{[m-1]}\right\rangle . \end{aligned}$$

Write

$$\begin{aligned} P_{{\varvec {v}}_k}^{\perp }=I_{d_S}-{\varvec {v}}_k\otimes {\varvec {v}}_k, \quad \text{and}\quad P_{{\varvec {w}}_k}^\perp = (I_{d_T}-{\varvec {w}}_k\otimes {\varvec {w}}_k). \end{aligned}$$

Then

$$\begin{aligned} \Vert {\tilde{{\varvec {b}}}}\Vert= & {} \lambda _k^2 \langle {\varvec {b}}^{[m-1]},{\varvec {v}}_k\rangle \langle {\varvec {c}}^{[m-1]},{\varvec {w}}_k\rangle ^2\langle {\varvec {v}}_k,{\varvec {b}}^{[m]}\rangle \\&+\langle \varDelta _1, P_{{\varvec {v}}_k}^{\perp }{\varvec {b}}^{[m]}\otimes P_{{\varvec {w}}_k}^\perp {\varvec {c}}^{[m-1]}\otimes P_{{\varvec {w}}_k}^\perp {\varvec {c}}^{[m-1]}\otimes P_{{\varvec {v}}_k}^{\perp }{\varvec {b}}^{[m-1]}\rangle \\&+\langle \varDelta _2+\varDelta _3, {\varvec {b}}^{[m]}\otimes {\varvec {c}}^{[m-1]}\otimes {\varvec {c}}^{[m-1]}\otimes {\varvec {b}}^{[m-1]}\rangle \\\le & {} \lambda _k^2 \langle {\varvec {b}}^{[m-1]},{\varvec {v}}_k\rangle \langle {\varvec {c}}^{[m-1]},{\varvec {w}}_k\rangle ^2\langle {\varvec {v}}_k,{\varvec {b}}^{[m]}\rangle \\&+\lambda _1^2\left ( 1-\langle {\varvec {v}}_k,{\varvec {b}}^{[m]}\rangle ^2\right) ^{1/2}\left ( 1-\langle {\varvec {v}}_k,{\varvec {b}}^{[m-1]}\rangle ^2\right) ^{1/2}\left ( 1-\langle {\varvec {w}}_k,{\varvec {c}}^{[m-1]}\rangle ^2\right) +\Vert \varDelta _2+\varDelta _3\Vert \\\le & {} \lambda _k^2 |\langle {\varvec {b}}^{[m-1]},{\varvec {v}}_k\rangle |\langle {\varvec {c}}^{[m-1]},{\varvec {w}}_k\rangle ^2\\&+\lambda _1^2\left ( 1-\langle {\varvec {v}}_k,{\varvec {b}}^{[m]}\rangle ^2\right) ^{1/2}\left ( 1-\langle {\varvec {v}}_k,{\varvec {b}}^{[m-1]}\rangle ^2\right) ^{1/2}\left ( 1-\langle {\varvec {w}}_k,{\varvec {c}}^{[m-1]}\rangle ^2\right) +\Vert \varDelta _2+\varDelta _3\Vert \\\le & {} \lambda _k^2\tau _{m-1}^3+\lambda _1^2\left ( 1-\tau _{m-1}^2\right) ^{3/2}\left ( 1-\langle {\varvec {v}}_k,{\varvec {b}}^{[m]}\rangle ^2\right) ^{1/2}+\Vert \varDelta _2+\varDelta _3\Vert . \end{aligned}$$

Therefore,

$$\begin{aligned} |\langle {\varvec {b}}^{[m]},{\varvec {v}}_k\rangle |= & {} |\langle {\tilde{{\varvec {b}}}},{\varvec {v}}_k\rangle |/\Vert {\tilde{{\varvec {b}}}}\Vert \\\ge & {} 1-\left ( \lambda _k^2\tau _{m-1}^3\right) ^{-1}\left[ \lambda _1^2\left ( 1-\tau _{m-1}^2\right) ^{3/2}\left ( 1-\langle {\varvec {v}}_k,{\varvec {b}}^{[m]}\rangle ^2\right) ^{1/2}\right] \\&-\left ( \lambda _k^2\tau _{m-1}^3\right) ^{-1}\Vert \varDelta _2+\varDelta _3\Vert \\\ge & {} 1-4\left ( \lambda _k^2\tau _{m-1}^3\right) ^{-1}\left[ \lambda _1^2\left ( 1-\tau _{m-1}\right) ^{3/2}\left ( 1-|\langle {\varvec {v}}_k,{\varvec {b}}^{[m]}\rangle |\right) ^{1/2}\right] \\&-\left ( \lambda _k^2\tau _{m-1}^3\right) ^{-1}\Vert \varDelta _2+\varDelta _3\Vert \\\ge & {} 1-\max \biggl \{8\left ( \lambda _k^2\tau _{m-1}^3\right) ^{-1}\left[ \lambda _1^2\left ( 1-\tau _{m-1}\right) ^{3/2}\left ( 1-|\langle {\varvec {v}}_k,{\varvec {b}}^{[m]}\rangle |\right) ^{1/2}\right] , \\&2\left ( \lambda _k^2\tau _{m-1}^3\right) ^{-1}\Vert \varDelta _2+\varDelta _3\Vert \biggr \}\\\ge & {} 1- \max \left\{ 64\left ( \lambda _k^2\tau _{m-1}^3\right) ^{-2}\lambda _1^4\left ( 1-\tau _{m-1}\right) ^3,2\left ( \lambda _k^2\tau _{m-1}^3\right) ^{-1}\Vert \varDelta _2+\varDelta _3\Vert \right\} . \end{aligned}$$

Assume that

$$\begin{aligned} \tau _{m-1}\ge \max \left\{ 1-\frac{1}{64}\left ( \frac{\lambda _k}{\lambda _1}\right) ^2,\frac{1}{2}\right\} , \end{aligned}$$
(9)

which we shall verify later. Then

$$\begin{aligned} 1-|\langle {\varvec {b}}^{[m]},{\varvec {v}}_k\rangle |\le \max \left\{ \frac{1}{2}\left ( 1-\tau _{m-1}\right) ,16\lambda _k^{-2}\Vert \varDelta _2+\varDelta _3\Vert \right\} . \end{aligned}$$
(10)

Similarly, we can show that

$$\begin{aligned} 1-|\langle {\varvec {c}}^{[m]},{\varvec {w}}_k\rangle |\le \max \left\{ \frac{1}{2}\left ( 1-\tau _{m-1}\right) , 16\lambda _k^{-2}\Vert \varDelta _2+\varDelta _3\Vert \right\} . \end{aligned}$$

Together, they imply that

$$\begin{aligned} 1-\tau _m\le \max \left\{ \frac{1}{2}\left ( 1-\tau _{m-1}\right) , 16\lambda _k^{-2}\Vert \varDelta _2+\varDelta _3\Vert \right\} . \end{aligned}$$
(11)

It is clear from (11) that if

$$\begin{aligned} 1-\tau _{m-1}\le 16\lambda _k^{-2}\Vert \varDelta _2+\varDelta _3\Vert , \end{aligned}$$
(12)

so is \(1-\tau _{m}\). Thus (12) holds for any

$$\begin{aligned} m\ge -\log _2\left ( \frac{16}{1-\tau _0}\lambda _k^{-2}\Vert \varDelta _2+\varDelta _3\Vert \right) . \end{aligned}$$

We now derive bounds for \(\Vert \varDelta _2+\varDelta _3\Vert\). By the triangle inequality, \(\Vert \varDelta _2+\varDelta _3\Vert \le \Vert \varDelta _2\Vert +\Vert \varDelta _3\Vert\). By Lemma 1,

$$\begin{aligned} \Vert \varDelta _2\Vert \le 6\sigma ^2\sqrt{\frac{d_S+d_T}{d_G}}. \end{aligned}$$

Next we consider bounding \(\Vert \varDelta _3\Vert\). Recall that

$$\begin{aligned} \varDelta _3=\frac{1}{d_G}\sum _{g=1}^{d_G}T_g\otimes E_g+\frac{1}{d_G}\sum _{g=1}^{d_G}E_g\otimes T_g. \end{aligned}$$

By the triangle inequality,

$$\begin{aligned} \Vert \varDelta _3\Vert \le \left\| {1\over d_G}\sum _{g=1}^{d_G}T_g\otimes E_g\right\| +\left\| {1\over d_G}\sum _{g=1}^{d_G}E_g\otimes T_g\right\| ={2\over d_G}\left\| \sum _{g=1}^{d_G}T_g\otimes E_g\right\| . \end{aligned}$$

Note that

$$\begin{aligned} \sum _{g=1}^{d_G}T_g\otimes E_g=\sqrt{d_G}\sum _{k=1}^r \lambda _k \left[ {\varvec {v}}_k\otimes {\varvec {w}}_k\otimes \left ( \sum _{g=1}^{d_G}u_{kg}E_g\right) \right] =:\sqrt{d_G}\sum _{k=1}^r \lambda _k \left ( {\varvec {v}}_k\otimes {\varvec {w}}_k\otimes Z_k\right) , \end{aligned}$$

where \(Z_k\)s are independent \(d_S\times d_T\) Gaussian ensembles. By Lemma 2, we get

$$\begin{aligned} \left\| \sum _{g=1}^{d_G}T_g\otimes E_g\right\| =O_p\left ( \lambda _1\sigma \sqrt{d_G (d_S+d_T)}\right) , \quad \text{as\ }d_G\rightarrow \infty , \end{aligned}$$

where we used the fact that \(r\le \min \{d_S,d_T\}\). Therefore,

$$\begin{aligned} \Vert \varDelta _3\Vert =O_p\left ( \lambda _1\sigma \sqrt{d_S+d_T\over d_G}\right) . \end{aligned}$$

Thus, (12) implies that

$$\begin{aligned} 1-\tau _{m}=O_p\left ( \lambda _k^{-2} (2\sigma ^2+\lambda _1\sigma )\sqrt{d_S+d_T\over d_G}\right) , \end{aligned}$$
(13)

for any large enough m.

It remains to verify condition (9), which we shall do by induction. In light of Theorem 1 and the assumption on \(\lambda _1\) and \(\lambda _k\), we know that it is satisfied when \(m=0\), as soon as the numerical constant \(C>0\) is taken large enough. Now if \(\tau _{m-1}\) satisfies (9), then (11) holds. We can then deduce that the lower bound given by (9) also holds for \(\tau _{m}\). □
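
The update analyzed above never requires forming the fourth-order tensor \(d_G^{-1}\sum _{g} X_g\otimes X_g\) explicitly: contracting it against the current iterates reduces to averaging \(({\varvec {b}}^\top X_g{\varvec {c}})\,X_g{\varvec {c}}\) over genes, consistent with the identity \({\mathcal {M}}^{-1} (I_{d_S\cdot d_T})\times _2 {\varvec {c}}\times _3 {\varvec {c}}\times _4 {\varvec {b}}={\varvec {b}}\) used in the proof. The NumPy sketch below is our own illustration of one such step for the kth component, together with the symmetric update for the temporal factor; it is not the authors' released code, and sigma2 stands for the noise variance \(\sigma ^2\), assumed known or pre-estimated.

```python
import numpy as np

def power_iteration_step(X, b, c, sigma2):
    """One refinement step for the k-th component (our illustration, not the
    authors' code). X has shape (d_G, d_S, d_T); b, c are current unit-norm
    estimates of v_k and w_k; sigma2 is the noise variance sigma^2 (assumed
    known or pre-estimated)."""
    d_G = X.shape[0]
    Xc   = X @ c                                  # (d_G, d_S): rows are X_g c
    Xtb  = np.einsum('gst,s->gt', X, b)           # (d_G, d_T): rows are X_g^T b
    scal = Xc @ b                                 # (d_G,): b^T X_g c for each gene

    # b_tilde = (1/d_G) sum_g (b^T X_g c) X_g c - sigma^2 b, then normalize.
    b_tilde = (scal @ Xc) / d_G - sigma2 * b
    b_new = b_tilde / np.linalg.norm(b_tilde)

    # Symmetric update for the temporal factor; whether it should reuse the
    # refreshed b is an ordering choice not fixed by this excerpt.
    c_tilde = (scal @ Xtb) / d_G - sigma2 * c
    c_new = c_tilde / np.linalg.norm(c_tilde)
    return b_new, c_new
```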

Appendix B: Auxiliary Results

We now derive tail bounds necessary for the proof of Theorem 2.

Lemma 1

Let \({\varvec {E}}\in {\mathbb {R}}^{d_1\times d_2\times d_3}\) (\(d_1\ge d_2\ge d_3\)) be a third-order tensor whose entries \(e_{i_1i_2i_3}\) (\(1\le i_k\le d_k\)) are independently sampled from the standard normal distribution. Write \(E_i= (e_{i_1i_2i_3})_{1\le i_2\le d_2, 1\le i_3\le d_3}\) for its ith (2, 3) slice. Then

$$\begin{aligned} \left\| {1\over d_1}\sum _{i=1}^{d_1} \left\{ E_{i}\otimes E_i-{\mathbb {E}}\left ( E_{i}\otimes E_i\right) \right\} \right\| \le 6\sqrt{d_2+d_3\over d_1} \end{aligned}$$

with probability tending to one as \(d_1\rightarrow \infty\).

Proof of Lemma 1

For brevity, denote by

$$\begin{aligned} {\varvec {T}}_i=E_{i}\otimes E_i-{\mathbb {E}}\left ( E_{i}\otimes E_i\right) \end{aligned}$$

and

$$\begin{aligned} {\varvec {T}}={1\over d_1}\sum _{i=1}^{d_1}{\varvec {T}}_i. \end{aligned}$$

Note that \({\varvec {T}}\) is a \(d_2\times d_3\times d_3\times d_2\) tensor obeying

$$\begin{aligned} T (\omega )=T (\pi _{14} (\omega ))=T (\pi _{23} (\omega )), \qquad \forall \omega \in [d_2]\times [d_3]\times [d_3]\times [d_2], \end{aligned}$$

where \(\pi _{k_1k_2}\) permutes the \(k_1\)th and \(k_2\)th entries of a vector. Therefore

$$\begin{aligned} \Vert {\varvec {T}}\Vert =\sup _{\begin{array}{c} {\varvec {a}}_1,{\varvec {a}}_2\in {\mathbb {R}}^{d_2}, {\varvec {b}}_1,{\varvec {b}}_2\in {\mathbb {R}}^{d_3}\\ \Vert {\varvec {a}}_1\Vert ,\Vert {\varvec {a}}_2\Vert ,\Vert {\varvec {b}}_1\Vert ,\Vert {\varvec {b}}_2\Vert = 1 \end{array}}\langle {\varvec {T}}, {\varvec {a}}_1\otimes {\varvec {b}}_1\otimes {\varvec {b}}_2\otimes {\varvec {a}}_2\rangle =\sup _{\begin{array}{c} {\varvec {a}}\in {\mathbb {R}}^{d_2}, {\varvec {b}}\in {\mathbb {R}}^{d_3}\\ \Vert {\varvec {a}}\Vert ,\Vert {\varvec {b}}\Vert = 1 \end{array}}\langle {\varvec {T}}, {\varvec {a}}\otimes {\varvec {b}}\otimes {\varvec {b}}\otimes {\varvec {a}}\rangle . \end{aligned}$$

Observe that for any \({\varvec {a}}_1,{\varvec {a}}_2\in {{\mathbb {S}}}^{d_2-1}\) and \({\varvec {b}}_1,{\varvec {b}}_2\in {{\mathbb {S}}}^{d_3-1}\),

$$\begin{aligned}&\left| \langle {\varvec {T}}, {\varvec {a}}_1\otimes {\varvec {b}}_1\otimes {\varvec {b}}_1\otimes {\varvec {a}}_1\rangle -\langle {\varvec {T}}, {\varvec {a}}_2\otimes {\varvec {b}}_2\otimes {\varvec {b}}_2\otimes {\varvec {a}}_2\rangle \right| \\&\quad \le \left| \langle {\varvec {T}}, {\varvec {a}}_1\otimes {\varvec {b}}_1\otimes {\varvec {b}}_1\otimes {\varvec {a}}_1\rangle -\langle {\varvec {T}}, {\varvec {a}}_2\otimes {\varvec {b}}_1\otimes {\varvec {b}}_1\otimes {\varvec {a}}_2\rangle \right| \\&\qquad +\left| \langle {\varvec {T}}, {\varvec {a}}_2\otimes {\varvec {b}}_1\otimes {\varvec {b}}_1\otimes {\varvec {a}}_2\rangle -\langle {\varvec {T}}, {\varvec {a}}_2\otimes {\varvec {b}}_2\otimes {\varvec {b}}_2\otimes {\varvec {a}}_2\rangle \right| \\&\quad \le \left| \langle {\varvec {T}}, ({\varvec {a}}_1-{\varvec {a}}_2)\otimes {\varvec {b}}_1\otimes {\varvec {b}}_1\otimes ({\varvec {a}}_1+{\varvec {a}}_2)\rangle \right| \\&\qquad +\left| \langle {\varvec {T}}, {\varvec {a}}_2\otimes ({\varvec {b}}_1-{\varvec {b}}_2)\otimes ({\varvec {b}}_1+{\varvec {b}}_2)\otimes {\varvec {a}}_2\rangle \right| \\&\quad \le 2\Vert {\varvec {T}}\Vert \left ( \Vert {\varvec {a}}_1-{\varvec {a}}_2\Vert +\Vert {\varvec {b}}_1-{\varvec {b}}_2\Vert \right) . \end{aligned}$$

In particular, if \(\Vert {\varvec {a}}_1-{\varvec {a}}_2\Vert ,\Vert {\varvec {b}}_1-{\varvec {b}}_2\Vert \le 1/8\), then

$$\begin{aligned} \left| \langle {\varvec {T}}, {\varvec {a}}_1\otimes {\varvec {b}}_1\otimes {\varvec {b}}_1\otimes {\varvec {a}}_1\rangle -\langle {\varvec {T}}, {\varvec {a}}_2\otimes {\varvec {b}}_2\otimes {\varvec {b}}_2\otimes {\varvec {a}}_2\rangle \right| \le {1\over 2}\Vert {\varvec {T}}\Vert . \end{aligned}$$
(14)

We can find a 1/8-covering set \({\mathcal {N}}_1\) of \({{\mathbb {S}}}^{d_2-1}\) such that \(|{\mathcal {N}}_1|\le 9^{d_2}\). Similarly, let \({\mathcal {N}}_2\) be a 1/8-covering set of \({{\mathbb {S}}}^{d_3-1}\) such that \(|{\mathcal {N}}_2|\le 9^{d_3}\). Then by (14)

$$\begin{aligned} \Vert {\varvec {T}}\Vert \le \sup _{{\varvec {a}}\in {\mathcal {N}}_1,{\varvec {b}}\in {\mathcal {N}}_2}\langle {\varvec {T}}, {\varvec {a}}\otimes {\varvec {b}}\otimes {\varvec {b}}\otimes {\varvec {a}}\rangle +{1\over 2}\Vert {\varvec {T}}\Vert , \end{aligned}$$

suggesting

$$\begin{aligned} \Vert {\varvec {T}}\Vert \le 2\sup _{{\varvec {a}}\in {\mathcal {N}}_1,{\varvec {b}}\in {\mathcal {N}}_2}\langle {\varvec {T}}, {\varvec {a}}\otimes {\varvec {b}}\otimes {\varvec {b}}\otimes {\varvec {a}}\rangle . \end{aligned}$$

Now note that for any \({\varvec {a}}\in {\mathcal {N}}_1\) and \({\varvec {b}}\in {\mathcal {N}}_2\),

$$\begin{aligned} \langle {\varvec {T}}_i, {\varvec {a}}\otimes {\varvec {b}}\otimes {\varvec {b}}\otimes {\varvec {a}}\rangle =\langle E_i,{\varvec {a}}\otimes {\varvec {b}}\rangle ^2-{\mathbb {E}}\langle E_i,{\varvec {a}}\otimes {\varvec {b}}\rangle ^2=\langle E_i,{\varvec {a}}\otimes {\varvec {b}}\rangle ^2-1\sim \chi ^2_1-1. \end{aligned}$$

Therefore

$$\begin{aligned} \langle {\varvec {T}}, {\varvec {a}}\otimes {\varvec {b}}\otimes {\varvec {b}}\otimes {\varvec {a}}\rangle \sim {1\over d_1}\chi ^2_{d_1}-1. \end{aligned}$$

An application of the \(\chi ^2\) tail bound from [17] leads to

$$\begin{aligned} {\mathbb {P}}\left\{ \langle {\varvec {T}}, {\varvec {a}}\otimes {\varvec {b}}\otimes {\varvec {b}}\otimes {\varvec {a}}\rangle \ge x\right\} \le \exp (-d_1x^2/4), \end{aligned}$$

for any \(x<1\). By the union bound,

$$\begin{aligned} {\mathbb {P}}\left\{ \sup _{{\varvec {a}}\in {\mathcal {N}}_1,{\varvec {b}}\in {\mathcal {N}}_2}\langle {\varvec {T}}, {\varvec {a}}\otimes {\varvec {b}}\otimes {\varvec {b}}\otimes {\varvec {a}}\rangle \ge x\right\} \le 9^{d_2+d_3}\exp (-d_1x^2/4), \end{aligned}$$

so that

$$\begin{aligned} \Vert {\varvec {T}}\Vert \le 6\sqrt{d_2+d_3\over d_1} \end{aligned}$$

with probability tending to one as \(d_1\rightarrow \infty\). □

Lemma 2

Let \(\{{\varvec {v}}_1,\ldots ,{\varvec {v}}_{d_1}\}\) be an orthonormal basis of \({\mathbb {R}}^{d_1}\), and \(\{{\varvec {w}}_1,\ldots ,{\varvec {w}}_{d_2}\}\) an orthonormal basis of \({\mathbb {R}}^{d_2}\). Let \(Z_1,\ldots , Z_r\) be independent \(d_3\times d_4\) Gaussian random matrices whose entries are independently drawn from the standard normal distribution. Then for any sequence of nonnegative numbers \(\lambda _1,\ldots , \lambda _r\le 1\):

$$\begin{aligned} {\mathbb {P}}\left\{ \left\| \sum _{k=1}^r \lambda _k \left ( {\varvec {v}}_k\otimes {\varvec {w}}_k\otimes Z_k\right) \right\| \ge \sqrt{d_3}+\sqrt{d_4}+\sqrt{2\log r}+t\right\} \le \exp (-t^2/2). \end{aligned}$$

Proof of Lemma 2

Observe that

$$\begin{aligned} \left\| \sum _{k=1}^r \lambda _k \left ( {\varvec {v}}_k\otimes {\varvec {w}}_k\otimes Z_k\right) \right\|= & {} \sup _{{\varvec {a}}\in {{\mathbb {S}}}^{d_1-1},{\varvec {b}}\in {{\mathbb {S}}}^{d_2-1}} \left\| \sum _{k=1}^r\lambda _k\langle {\varvec {a}}, {\varvec {v}}_k\rangle \langle {\varvec {b}}, {\varvec {w}}_k\rangle Z_k\right\| \\= & {} \sup _{{\varvec {a}}\in {{\mathbb {S}}}^{r-1},{\varvec {b}}\in {{\mathbb {S}}}^{r-1}} \left\| \sum _{k=1}^r\lambda _ka_kb_k Z_k\right\| \\\le & {} \sup _{{\varvec {a}}\in {{\mathbb {S}}}^{r-1},{\varvec {b}}\in {{\mathbb {S}}}^{r-1}} \sum _{k=1}^r\lambda _ka_kb_k \Vert Z_k\Vert \\\le & {} \left ( \max _{1\le k\le r}\lambda _k\Vert Z_k\Vert \right) \left ( \sup _{{\varvec {a}}\in {{\mathbb {S}}}^{r-1},{\varvec {b}}\in {{\mathbb {S}}}^{r-1}}\sum _{k=1}^ra_kb_k\right) \\\le & {} \max _{1\le k\le r}\Vert Z_k\Vert . \end{aligned}$$

By concentration bounds for Gaussian random matrices,

$$\begin{aligned} {\mathbb {P}}\left\{ \Vert Z_k\Vert \ge \sqrt{d_3}+\sqrt{d_4}+t\right\} \le \exp (-t^2/2). \end{aligned}$$

See, e.g., [34]. The claim then follows by taking a union bound over \(1\le k\le r\): with \(t\) replaced by \(\sqrt{2\log r}+t\), the right-hand side becomes \(r\exp \{- (\sqrt{2\log r}+t)^2/2\}\le \exp (-t^2/2)\). □
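
The Gaussian operator-norm bound invoked here is easy to check numerically. The short simulation below is our own illustration, not part of the paper: it draws independent standard Gaussian \(d_3\times d_4\) matrices and compares their spectral norms with \(\sqrt{d_3}+\sqrt{d_4}\).

```python
import numpy as np

rng = np.random.default_rng(1)
d3, d4, reps = 80, 120, 200

# Spectral norms of independent standard Gaussian d3 x d4 matrices.
norms = np.array([np.linalg.norm(rng.normal(size=(d3, d4)), 2) for _ in range(reps)])
print("mean spectral norm       :", norms.mean())
print("sqrt(d3) + sqrt(d4)      :", np.sqrt(d3) + np.sqrt(d4))
print("fraction exceeding bound :", np.mean(norms > np.sqrt(d3) + np.sqrt(d4)))
```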

About this article

Cite this article

Liu, T., Yuan, M. & Zhao, H. Characterizing Spatiotemporal Transcriptome of the Human Brain Via Low-Rank Tensor Decomposition. Stat Biosci 14, 485–513 (2022). https://doi.org/10.1007/s12561-021-09331-5