Simultaneous Component Analysis by Means of Tucker3

Abstract

A new model for simultaneous component analysis (SCA) is introduced that contains the existing SCA models with common loading matrix as special cases. The new SCA-T3 model is a multi-set generalization of the Tucker3 model for component analysis of three-way data. For each mode (observational units, variables, sets) a different number of components can be chosen and the obtained solution can be rotated without loss of fit to facilitate interpretation. SCA-T3 can be fitted on centered multi-set data and also on the corresponding covariance matrices. For this purpose, alternating least squares algorithms are derived. SCA-T3 is evaluated in a simulation study, and its practical merits are demonstrated for several benchmark datasets.

Fig. 1

References

  1. Acar, E., & Yener, B. (2009). Unsupervised multiway data analysis: A literature survey. IEEE Transactions on Knowledge and Data Engineering, 21, 1–15.

  2. Carroll, J. D., & Chang, J. J. (1970). Analysis of individual differences in multidimensional scaling via an \(n\)-way generalization of Eckart-Young decomposition. Psychometrika, 35, 283–319.

  3. Ceulemans, E., & Kiers, H. A. L. (2006). Selecting among three-mode principal component models of different types and complexities: a numerical convex hull based method. British Journal of Mathematical and Statistical Psychology, 59, 133–150.

  4. Ceulemans, E., Timmerman, M. E., & Kiers, H. A. L. (2011). The CHull procedure for selecting among multilevel component solutions. Chemometrics and Intelligent Laboratory Systems, 106, 12–20.

  5. Comon, P., & De Lathauwer, L. (2010). Algebraic identification of under-determined mixtures. In P. Comon & C. Jutten (Eds.), Handbook of blind source separation: Independent component analysis and applications (pp. 325–366). Cambridge: Academic Press.

  6. De Lathauwer, L., De Moor, B., & Vandewalle, J. (2000a). A multilinear singular value decomposition. SIAM Journal on Matrix Analysis and Applications, 21, 1253–1278.

  7. De Lathauwer, L., De Moor, B., & Vandewalle, J. (2000b). On the best rank-1 and rank-\((R_1, R_2,\ldots, R_N)\) approximation of higher-order tensors. SIAM Journal on Matrix Analysis and Applications, 21, 1324–1342.

  8. De Lathauwer, L. (2010). Algebraic methods after prewhitening. In P. Comon & C. Jutten (Eds.), Handbook of blind source separation: Independent component analysis and applications (pp. 155–178). Cambridge: Academic Press.

  9. De Roover, K., Ceulemans, E., Timmerman, M. E., Vansteelandt, K., Stouten, J., & Onghena, P. (2012). Clusterwise simultaneous component analysis for analyzing structural differences in multivariate multiblock data. Psychological Methods, 17, 100–119.

  10. De Roover, K., Ceulemans, E., Timmerman, M. E., Nezlek, J. B., & Onghena, P. (2013). Modeling differences in the dimensionality of multiblock data by means of clusterwise simultaneous component analysis. Psychometrika, 78, 648–668.

  11. De Roover, K., Ceulemans, E., Timmerman, M. E., & Onghena, P. (2013). A clusterwise simultaneous component method for capturing within-cluster differences in component variances and correlations. British Journal of Mathematical and Statistical Psychology, 66, 81–102.

  12. De Silva, V., & Lim, L.-H. (2008). Tensor rank and the ill-posedness of the best low-rank approximation problem. SIAM Journal on Matrix Analysis and Applications, 30, 1084–1127.

  13. Domanov, I., & De Lathauwer, L. (2013). On the uniqueness of the canonical polyadic decomposition of third-order tensors—Part II: uniqueness of the overall decomposition. SIAM Journal on Matrix Analysis and Applications, 34, 876–903.

  14. Eckart, C., & Young, G. (1936). The approximation of one matrix by another of lower rank. Psychometrika, 1, 211–218.

  15. Golub, G. H., & Van Loan, C. F. (1996). Matrix computations (3rd ed.). Baltimore: Johns Hopkins University Press.

  16. Harshman, R. A. (1970). Foundations of the Parafac procedure: Models and conditions for an “explanatory” multimodal factor analysis. UCLA Working Papers in Phonetics (vol. 16, pp. 1–84).

  17. Harshman, R. A. (1972). Parafac2: Mathematical and technical notes. UCLA Working Papers in Phonetics (vol. 22, pp. 30–44).

  18. Helwig, N. E. (2013). The special sign indeterminacy of the direct-fitting Parafac2 model: Some implications, cautions, and recommendations for simultaneous component analysis. Psychometrika, 78, 725–739.

  19. Ishteva, M., Absil, P.-A., Van Huffel, S., & De Lathauwer, L. (2011). Best low multilinear rank approximation of higher-order tensors, based on the Riemannian trust-region scheme. SIAM Journal on Matrix Analysis and Applications, 32, 115–135.

  20. Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychometrika, 36, 409–426.

  21. Kiers, H. A. L. (1993). An alternating least squares algorithm for PARAFAC2 and three-way Dedicom. Computational Statistics and Data Analysis, 16, 103–118.

  22. Kiers, H. A. L., & Ten Berge, J. M. F. (1994). Hierarchical relations between methods for simultaneous component analysis and a technique for rotation to a simple simultaneous structure. British Journal of Mathematical and Statistical Psychology, 47, 109–126.

  23. Kiers, H. A. L. (1998a). Three-way SIMPLIMAX for oblique rotation of the three-mode factor analysis core to simple structure. Computational Statistics and Data Analysis, 28, 307–324.

  24. Kiers, H. A. L. (1998b). Joint orthomax rotation of the core and component matrices resulting from three-mode principal components analysis. Journal of Classification, 15, 245–263.

  25. Kiers, H. A. L., & Smilde, A. K. (1998). Constrained three-mode factor analysis as a tool for parameter estimation with second-order instrumental data. Journal of Chemometrics, 12, 125–147.

  26. Kiers, H. A. L., Ten Berge, J. M. F., & Bro, R. (1999). Parafac2—Part I. A direct fitting algorithm for the Parafac2 model. Journal of Chemometrics, 13, 275–294.

  27. Kiers, H. A. L. (2004). Bootstrap confidence intervals for three-way methods. Journal of Chemometrics, 18, 22–36.

  28. Kolda, T. G., & Bader, B. W. (2009). Tensor decompositions and applications. SIAM Review, 51, 455–500.

  29. Kroonenberg, P. M., & De Leeuw, J. (1980). Principal component analysis of three-mode data by means of alternating least squares algorithms. Psychometrika, 45, 69–97.

  30. Kroonenberg, P. M. (2008). Applied multiway data analysis (Wiley series in probability and statistics). Hoboken, NJ: Wiley.

  31. Lam, T. T. T. (2015). Some new methods for three-mode factor analysis and multi-set factor analysis. Ph.D. thesis, University of Groningen, The Netherlands.

  32. Louwerse, D. J., Smilde, A. K., & Kiers, H. A. L. (1999). Cross-validation of multiway component models. Journal of Chemometrics, 13, 491–510.

  33. McGaw, B., & Jöreskog, K. G. (1971). Factorial invariance of ability measures in groups differing in intelligence and socioeconomic status. British Journal of Mathematical and Statistical Psychology, 24, 154–168.

  34. Penrose, R. (1956). On the best approximate solutions of linear matrix equations. Mathematical Proceedings of the Cambridge Philosophical Society, 52, 17–19.

  35. Rocci, R. (1992). Three-mode factor analysis with binary core and orthonormality constraints. Journal of the Italian Statistical Society, 1, 413–422.

  36. Savas, B., & Lim, L.-H. (2010). Quasi-Newton methods on Grassmannians and multilinear approximation of tensors. SIAM Journal on Scientific Computing, 32, 3352–3393.

  37. Shifren, K., Hooker, K., Wood, P., & Nesselroade, J. R. (1997). Structure and variation of mood in individuals with Parkinson’s disease: A dynamic factor analysis. Psychology and Aging, 12, 328–339.

  38. Smilde, A., Bro, R., & Geladi, P. (2004). Multi-way analysis: Applications in the chemical sciences. Chichester: Wiley.

  39. Stegeman, A. (2006). Degeneracy in Candecomp/Parafac explained for \(p\times p\times 2\) arrays of rank \(p+1\) or higher. Psychometrika, 71, 483–501.

  40. Stegeman, A. (2014). Finding the limit of diverging components in three-way Candecomp/Parafac–A demonstration of its practical merits. Computational Statistics and Data Analysis, 75, 203–216.

  41. Stegeman, A., & Lam, T. T. T. (2014). Three-mode factor analysis by means of Candecomp/Parafac. Psychometrika, 79, 426–443.

  42. Stegeman, A., & Lam, T. T. T. (2016). Multi-set factor analysis by means of Parafac2. British Journal of Mathematical and Statistical Psychology, 69, 1–19.

  43. Ten Berge, J. M. F., & Kiers, H. A. L. (1991). A numerical approach to the approximate and the exact minimum rank of a covariance matrix. Psychometrika, 56, 309–315.

  44. Ten Berge, J. M. F., & Kiers, H. A. L. (1996). Some uniqueness results for Parafac2. Psychometrika, 61, 123–132.

  45. Ten Berge, J. M. F., & Smilde, A. K. (2002). Non-triviality and identification of a constrained Tucker3 analysis. Journal of Chemometrics, 16, 609–612.

  46. Timmerman, M., & Kiers, H. A. L. (2003). Four simultaneous component models for the analysis of multivariate time series from more than one subject to model intraindividual and interindividual differences. Psychometrika, 68, 105–121.

  47. Tomasi, G., & Bro, R. (2006). A Comparison of algorithms for fitting the Parafac model. Computational Statistics and Data Analysis, 50, 1700–1734.

  48. Tucker, L. R. (1966). Some mathematical notes on three-mode factor analysis. Psychometrika, 31, 279–311.

  49. Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54, 1063–1070.

Acknowledgements

The author would like to thank Kim Shifren (Department of Psychology, Towson University) for permission to use the PANAS dataset, and Henk Kiers (Department of Psychometrics and Statistics, University of Groningen) for suggestions on how to rotate a Tucker3 solution to another one. Research supported by: (1) Research Council KU Leuven: C1 project c16/15/059-nD; (2) the Belgian Federal Science Policy Office: IUAP P7 (DYSCO II, Dynamical systems, control and optimization, 2012–2017).

Author information

Correspondence to Alwin Stegeman.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 119 KB)

Appendix: Proofs of Lemma 3.1 and Lemma 3.2
Proof of Lemma 3.1

We start by proving equality (9) for SCA-P fitted via PCA as described at the beginning of Sect. 3.2. From the latter it can be verified that \(\mathbf{F}_k=\mathbf{X}_k\,\mathbf{B}\,(\mathbf{B}^T\mathbf{B})^{-1}\). This implies that

$$\begin{aligned} \Vert \mathbf{X}_k-\mathbf{F}_k\,\mathbf{B}^T\Vert ^2 &= \mathrm{trace}(\mathbf{X}_k^T\mathbf{X}_k) - \mathrm{trace}(\mathbf{B}\,(\mathbf{B}^T\mathbf{B})^{-1}\mathbf{B}^T\mathbf{X}_k^T\mathbf{X}_k) \\ &= \mathrm{trace}(\mathbf{X}_k^T\mathbf{X}_k) - \mathrm{trace}(\mathbf{B}\,(\mathbf{B}^T\mathbf{B})^{-1}\mathbf{B}^T\mathbf{X}_k^T\mathbf{X}_k\,\mathbf{B}\,(\mathbf{B}^T\mathbf{B})^{-1}\mathbf{B}^T) \\ &= \Vert \mathbf{X}_k\Vert ^2 - \Vert \mathbf{F}_k\,\mathbf{B}^T\Vert ^2, \end{aligned}$$

which is equivalent to (9).
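Equality (9) for SCA-P can also be verified numerically. The sketch below is illustrative only (random data; the names and dimensions are not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, Q = 20, 6, 2                        # units, variables, components (illustrative)
Xk = rng.standard_normal((I, J))          # centered data block X_k
B = rng.standard_normal((J, Q))           # common loading matrix B

# OLS component scores: F_k = X_k B (B^T B)^{-1}
Fk = Xk @ B @ np.linalg.inv(B.T @ B)

# Equality (9): ||X_k - F_k B^T||^2 = ||X_k||^2 - ||F_k B^T||^2
lhs = np.linalg.norm(Xk - Fk @ B.T) ** 2
rhs = np.linalg.norm(Xk) ** 2 - np.linalg.norm(Fk @ B.T) ** 2
assert np.isclose(lhs, rhs)
```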

Next, we consider SCA-PF2. It suffices to show that trace\((\mathbf{X}_k^T\mathbf{F}_k\,\mathbf{B}^T)=\) trace\((\mathbf{B}\,\mathbf{F}_k^T\mathbf{F}_k\,\mathbf{B}^T)\). In SCA-PF2 we can set \(\mathbf{F}_k=\widetilde{\mathbf{A}}_k\,\mathbf{H}\,\widetilde{\mathbf{C}}_k\), with \(\widetilde{\mathbf{A}}_k^T\widetilde{\mathbf{A}}_k=\mathbf{I}_R\), \(\mathbf{H}^T\mathbf{H}=\widetilde{\mathbf{\Phi }}\), and \(\widetilde{\mathbf{C}}_k\) diagonal \(R\times R\) (Kiers et al., 1999; Timmerman & Kiers, 2003). In step 2 of the ALS algorithm for SCA-PF2 the objective function is minimized over \(\mathbf{H}\), \(\widetilde{\mathbf{C}}_k\), and \(\mathbf{B}\) for fixed \(\widetilde{\mathbf{A}}_k\). Analogous to Sect. 3.1, this boils down to fitting Parafac to the \(R\times J\times K\) array with slices \(\widetilde{\mathbf{A}}_k^T\mathbf{X}_k\), \(k=1,\ldots ,K\). One iteration of the Parafac ALS algorithm is used to approximate \(\widetilde{\mathbf{A}}_k^T\mathbf{X}_k\approx \mathbf{H}\,\widetilde{\mathbf{C}}_k\,\mathbf{B}^T\). Vectorizing on both sides yields \(\mathrm{Vec}(\widetilde{\mathbf{A}}_k^T\mathbf{X}_k)\approx (\mathbf{B}\odot \mathbf{H})\,\tilde{\mathbf{c}}_k^\mathrm{(row)}\), with \(\tilde{\mathbf{c}}_k^\mathrm{(row)}\) denoting the kth row of \(\widetilde{\mathbf{C}}\) as a column vector. The OLS regression update of \(\tilde{\mathbf{c}}_k^\mathrm{(row)}\) for fixed \(\mathbf{H}\) and \(\mathbf{B}\) implies that

$$\begin{aligned} \mathrm{trace}(\mathrm{Vec}(\widetilde{\mathbf{A}}_k^T\mathbf{X}_k)^T(\mathbf{B}\odot \mathbf{H})\,\tilde{\mathbf{c}}_k^\mathrm{(row)})= \mathrm{trace}((\tilde{\mathbf{c}}_k^\mathrm{(row)})^T(\mathbf{B}\odot \mathbf{H})^T(\mathbf{B}\odot \mathbf{H})\,\tilde{\mathbf{c}}_k^\mathrm{(row)}), \end{aligned}$$

which is equivalent to trace\((\mathbf{X}_k^T\widetilde{\mathbf{A}}_k\,\mathbf{H}\,\widetilde{\mathbf{C}}_k\,\mathbf{B}^T)=\Vert \widetilde{\mathbf{A}}_k\,\mathbf{H}\,\widetilde{\mathbf{C}}_k\,\mathbf{B}^T\Vert ^2\). This is the desired result. The proof for SCA-IND follows from the above by setting \(\widetilde{\mathbf{\Phi }}=\mathbf{I}_R\) (Timmerman & Kiers, 2003).
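The trace identity used here is simply the normal-equation property of OLS: if \(\mathbf{c}\) solves \(\min _{\mathbf{c}}\Vert \mathbf{y}-\mathbf{M}\,\mathbf{c}\Vert ^2\), then \(\mathbf{y}^T\mathbf{M}\,\mathbf{c}=\mathbf{c}^T\mathbf{M}^T\mathbf{M}\,\mathbf{c}\). A minimal numerical sketch with \(\mathbf{M}=\mathbf{B}\odot \mathbf{H}\) (the `khatri_rao` helper and all dimensions are illustrative assumptions, not from the paper):

```python
import numpy as np

def khatri_rao(B, H):
    """Columnwise Kronecker (Khatri-Rao) product: column r equals kron(B[:, r], H[:, r])."""
    return np.einsum('jr,pr->jpr', B, H).reshape(-1, B.shape[1])

rng = np.random.default_rng(1)
R, J, P = 2, 5, 3                         # components, variables, row dimension (illustrative)
B = rng.standard_normal((J, R))
H = rng.standard_normal((P, R))
y = rng.standard_normal(J * P)            # stands in for Vec(A_k^T X_k)

M = khatri_rao(B, H)                      # predictor matrix B (Khatri-Rao) H
c = np.linalg.lstsq(M, y, rcond=None)[0]  # OLS update of c_k^(row)

# Normal equations M^T y = M^T M c imply y^T M c = c^T M^T M c
assert np.isclose(y @ M @ c, c @ M.T @ M @ c)
```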

Finally, we consider SCA-T3 fitted by the ALS algorithm in Sect. 3.1. The proof is analogous to the proof for SCA-PF2. We write \(\mathbf{F}_k=\widetilde{\mathbf{A}}_k\left( \sum _r \tilde{c}_{kr}\,\mathbf{G}_r\right) \), with \(\widetilde{\mathbf{A}}_k^T\widetilde{\mathbf{A}}_k=\mathbf{I}_P\). In step 2 of the ALS algorithm in Sect. 3.1 we fit Tucker2 to the \(P\times J\times K\) array with slices \(\widetilde{\mathbf{A}}_k^T\mathbf{X}_k\), \(k=1,\ldots ,K\). One iteration of the Tucker2 ALS algorithm is used to approximate \(\widetilde{\mathbf{A}}_k^T\mathbf{X}_k\approx \left( \sum _r \tilde{c}_{kr}\,\mathbf{G}_r\right) \mathbf{B}^T\). Vectorizing on both sides yields \(\mathrm{Vec}(\widetilde{\mathbf{A}}_k^T\mathbf{X}_k)\approx (\mathbf{B}\otimes \mathbf{I}_P)\,[\mathrm{Vec}(\mathbf{G}_1)\;\ldots \;\mathrm{Vec}(\mathbf{G}_R)]\,\tilde{\mathbf{c}}_k^\mathrm{(row)}\). As above, the OLS regression update of \(\tilde{\mathbf{c}}_k^\mathrm{(row)}\) for fixed \(\mathbf{B}\) and \(\mathcal{G}\) yields the result. This completes the proof. \(\square \)
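The vectorization step for SCA-T3 rests on the identity \(\mathrm{Vec}(\mathbf{G}\,\mathbf{B}^T)=(\mathbf{B}\otimes \mathbf{I}_P)\,\mathrm{Vec}(\mathbf{G})\), with column-stacking Vec. A numerical check of the vectorized form, using illustrative dimensions and random data:

```python
import numpy as np

rng = np.random.default_rng(2)
P, Q, J, R = 2, 2, 4, 3                   # illustrative dimensions
G = [rng.standard_normal((P, Q)) for _ in range(R)]   # core slices G_1, ..., G_R
B = rng.standard_normal((J, Q))
c = rng.standard_normal(R)                # kth row of C~ as a column vector

vec = lambda M: M.flatten(order='F')      # column-stacking Vec operator

# Left side: Vec((sum_r c_r G_r) B^T)
lhs = vec(sum(c[r] * G[r] for r in range(R)) @ B.T)

# Right side: (B kron I_P) [Vec(G_1) ... Vec(G_R)] c
Gmat = np.column_stack([vec(G[r]) for r in range(R)])
rhs = np.kron(B, np.eye(P)) @ Gmat @ c
assert np.allclose(lhs, rhs)
```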

Generally, the equality in (9) follows from an OLS regression update specifically for sample k. For SCA-ECP there is no such update and the equality does not hold. Indeed, fitting SCA-ECP to the PANAS dataset in Sect. 5.1 yields inequality in (9).

Proof of Lemma 3.2

After step 2 of the ALS algorithm for SCA-T3 (Sect. 3.1) the fitted model for \(\mathbf{X}_k\) can be written as \(\widetilde{\mathbf{A}}_k\,\mathbf{G}\,(\tilde{\mathbf{c}}_k^\mathrm{(row)}\otimes \mathbf{B}^T)\), with \(\mathbf{G}=[\mathbf{G}_1\;\ldots \;\mathbf{G}_R]\) and \(\tilde{\mathbf{c}}_k^\mathrm{(row)}\) the kth row of \(\widetilde{\mathbf{C}}\) as a column vector. Analogous to (4) for Tucker3 (Sect. 2.1), we obtain

$$\begin{aligned} \mathrm{Vec}(\mathbf{X}_\mathrm{all})\approx \left[ \begin{array}{c} (\tilde{\mathbf{c}}_1^\mathrm{(row)})^T\otimes \mathbf{B}\otimes \widetilde{\mathbf{A}}_1 \\ \vdots \\ (\tilde{\mathbf{c}}_K^\mathrm{(row)})^T\otimes \mathbf{B}\otimes \widetilde{\mathbf{A}}_K \end{array}\right] \mathrm{Vec}(\mathbf{G}). \end{aligned}$$
(18)

The matrix on the right-hand side is columnwise orthonormal, since

$$\begin{aligned} \sum _{k=1}^K ((\tilde{\mathbf{c}}_k^\mathrm{(row)})^T\otimes \mathbf{B}\otimes \widetilde{\mathbf{A}}_k)^T((\tilde{\mathbf{c}}_k^\mathrm{(row)})^T\otimes \mathbf{B}\otimes \widetilde{\mathbf{A}}_k) &= \sum _{k=1}^K (\tilde{\mathbf{c}}_k^\mathrm{(row)}(\tilde{\mathbf{c}}_k^\mathrm{(row)})^T\otimes \mathbf{B}^T\mathbf{B}\otimes \widetilde{\mathbf{A}}_k^T\widetilde{\mathbf{A}}_k) \\ &= \left( \sum _{k=1}^K \tilde{\mathbf{c}}_k^\mathrm{(row)}(\tilde{\mathbf{c}}_k^\mathrm{(row)})^T\right) \otimes \mathbf{I}_Q\otimes \mathbf{I}_P \\ &= \widetilde{\mathbf{C}}^T\widetilde{\mathbf{C}}\otimes \mathbf{I}_Q\otimes \mathbf{I}_P \\ &= \mathbf{I}_R\otimes \mathbf{I}_Q\otimes \mathbf{I}_P=\mathbf{I}_{PQR}. \end{aligned}$$
(19)

The update of \(\mathcal{G}\) in step 2 of the ALS algorithm for SCA-T3 in Sect. 3.1 is computed via OLS regression in (18). Indeed, this is identical to the update of \(\mathcal{G}\) in the ALS algorithm for Tucker2 fitted to the \(P\times J\times K\) array with slices \(\widetilde{\mathbf{A}}_k^T\mathbf{X}_k\), \(k=1,\ldots ,K\). After convergence of the ALS algorithm for SCA-T3 we set \(\mathbf{A}_k=N_k^{1/2}\widetilde{\mathbf{A}}_k\) and \(\mathbf{c}_k^\mathrm{(row)}=N_k^{-1/2}\,\tilde{\mathbf{c}}_k^\mathrm{(row)}\), \(k=1,\ldots ,K\). For the blocks of the matrix in (18), we have \((\mathbf{c}_k^\mathrm{(row)})^T\otimes \mathbf{B}\otimes \mathbf{A}_k=(\tilde{\mathbf{c}}_k^\mathrm{(row)})^T\otimes \mathbf{B}\otimes \widetilde{\mathbf{A}}_k\). Since the update of \(\mathbf{G}\) in (18) is done via OLS regression with orthonormal predictors (19), it follows that for SCA-T3 we can compute the fit percentage due to term (pqr) as in (11). Moreover, these terms sum up to the total fit percentage (8). This is analogous to Tucker3 (Sect. 2.1). This completes the proof. \(\square \)
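The columnwise orthonormality (19) of the stacked matrix in (18) can likewise be checked numerically. In the sketch below, columnwise orthonormal \(\widetilde{\mathbf{A}}_k\), \(\mathbf{B}\), and \(\widetilde{\mathbf{C}}\) are generated via QR decompositions; all dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
P, Q, R, K, I, J = 2, 2, 2, 5, 6, 4       # illustrative dimensions

# Columnwise orthonormal factors: A_k^T A_k = I_P, B^T B = I_Q, C^T C = I_R
A = [np.linalg.qr(rng.standard_normal((I, P)))[0] for _ in range(K)]
B = np.linalg.qr(rng.standard_normal((J, Q)))[0]
C = np.linalg.qr(rng.standard_normal((K, R)))[0]

# Stack the blocks (c_k^(row))^T kron B kron A_k as in (18)
M = np.vstack([np.kron(np.kron(C[k:k + 1, :], B), A[k]) for k in range(K)])

# (19): the stacked matrix is columnwise orthonormal
assert np.allclose(M.T @ M, np.eye(P * Q * R))
```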


Cite this article

Stegeman, A. Simultaneous Component Analysis by Means of Tucker3. Psychometrika 83, 21–47 (2018). https://doi.org/10.1007/s11336-017-9568-7

Keywords

  • simultaneous component analysis
  • multi-set data
  • Tucker
  • Parafac
  • rotation