Abstract
CP decomposition of large third-order tensors can be computationally challenging. Parameters are typically estimated by means of the alternating least squares (ALS) procedure because it yields least-squares solutions and provides consistent outcomes. Nevertheless, ALS has two major flaws that are particularly problematic for large-scale problems: slow convergence and sensitivity to degeneracy conditions such as over-factoring, collinearity, bad initialization and local minima. More efficient algorithms have been proposed in the literature. They are, however, much less dependable than ALS in delivering stable results, because the increased speed often comes at the expense of accuracy. In particular, the alternating trilinear decomposition (ATLD) procedure is one of the fastest alternatives, but it is rarely employed because its convergence is unreliable. As a solution, multi-optimization is proposed: ATLD and ALS steps are concatenated in an integrated procedure with the purpose of increasing efficiency without a significant loss in precision. This methodology has been implemented and tested under realistic conditions on simulated data sets.
References
Andersson CA, Bro R (1998) Improving the speed of multi-way algorithms: part I. Tucker3. Chemom Intell Lab Syst 42(1):93–103
Beh EJ, Lombardo R (2014) Correspondence analysis: theory, practice and new strategies. Wiley, Hoboken
Bro R (1998) Multi-way analysis in the food industry. Models, algorithms, and applications. Academisch proefschrift, Denmark
Bro R, Andersson CA (1998) Improving the speed of multiway algorithms: part II: compression. Chemom Intell Lab Syst 42(1–2):105–113
Bro R, Kiers HA (2003) A new efficient method for determining the number of components in PARAFAC models. J Chemom 17(5):274–286
Carlier A, Kroonenberg PM (1996) Decompositions and biplots in three-way correspondence analysis. Psychometrika 61(2):355–373
Carroll JD, Chang JJ (1970) Analysis of individual differences in multidimensional scaling via an n-way generalization of “Eckart–Young” decomposition. Psychometrika 35(3):283–319
Cattell RB (1944) “Parallel proportional profiles” and other principles for determining the choice of factors by rotation. Psychometrika 9(4):267–283
Ceulemans E, Kiers HA (2006) Selecting among three-mode principal component models of different types and complexities: a numerical convex hull based method. Br J Math Stat Psychol 59(1):133–150
Chen ZP, Wu HL, Jiang JH, Li Y, Yu RQ (2000) A novel trilinear decomposition algorithm for second-order linear calibration. Chemom Intell Lab Syst 52(1):75–86
Chen ZP, Liu Z, Cao YZ, Yu RQ (2001) Efficient way to estimate the optimum number of factors for trilinear decomposition. Anal Chim Acta 444(2):295–307
R Core Team (2012) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
Di Palma M, Filzmoser P, Gallo M, Hron K (2017) A robust Parafac model for compositional data. J Appl Stat. https://doi.org/10.1080/02664763.2017.1381669
Domanov I, De Lathauwer L (2013a) On the uniqueness of the canonical polyadic decomposition of third-order tensors—part I: basic results and uniqueness of one factor matrix. SIAM J Matrix Anal Appl 34(3):855–875
Domanov I, De Lathauwer L (2013b) On the uniqueness of the canonical polyadic decomposition of third-order tensors—part II: uniqueness of the overall decomposition. SIAM J Matrix Anal Appl 34(3):876–903
Domanov I, De Lathauwer L (2017) Canonical polyadic decomposition of third-order tensors: relaxed uniqueness conditions and algebraic algorithm. Linear Algebra Appl 513:342–375
Engelen S, Hubert M (2011) Detecting outlying samples in a parallel factor analysis model. Anal Chim Acta 705(1–2):155–165
Faber NKM, Bro R, Hopke PK (2003) Recent developments in CANDECOMP/PARAFAC algorithms: a critical review. Chemom Intell Lab Syst 65(1):119–137
Gallo M, Simonacci V, Di Palma MA (2018) An integrated algorithm for three-way compositional data. Qual Quant 53(5):2353–2370
Giordani P, Kiers HA, Del Ferraro MA (2014) Three-way component analysis using the R package ThreeWay. J Stat Softw 57(7):1–23
Harshman RA (1970) Foundations of the PARAFAC procedure: models and conditions for an “explanatory” multi-modal factor analysis. UCLA working papers in phonetics, no 16, pp 1–84
Helwig NE (2017) multiway: component models for multi-way data. R package version 1.0-3
Hitchcock FL (1927) The expression of a tensor or a polyadic as a sum of products. J Math Phys 6(1–4):164–189
Hitchcock FL (1928) Multiple invariants and generalized rank of a p-way matrix or tensor. J Math Phys 7(1–4):39–79
Kiers HA (1998) A three-step algorithm for CANDECOMP/PARAFAC analysis of large data sets with multicollinearity. J Chemom 12(3):155–171
Kiers HA (2000) Towards a standardized notation and terminology in multiway analysis. J Chemom 14(3):105–122
Kiers HA, Harshman RA (1997) Relating two proposed methods for speedup of algorithms for fitting two- and three-way principal component and related multilinear models. Chemom Intell Lab Syst 36(1):31–40
Kruskal JB (1989) Rank, decomposition, and uniqueness for 3-way and N-way arrays. In: Coppi R, Bolasco S (eds) Multiway data analysis. North-Holland Publishing Co., Amsterdam, pp 7–18
Leurgans S, Ross RT (1992) Multilinear models: applications in spectroscopy. Stat Sci 7(3):289–310
Lorenzo-Seva U, Ten Berge JM (2006) Tucker’s congruence coefficient as a meaningful index of factor similarity. Methodology 2(2):57–64
Mitchell BC, Burdick DS (1993) An empirical comparison of resolution methods for three-way arrays. Chemom Intell Lab Syst 20(2):149–161
Mitchell BC, Burdick DS (1994) Slowly converging PARAFAC sequences: swamps and two-factor degeneracies. J Chemom 8(2):155–168
Phan AH, Cichocki A (2011) PARAFAC algorithms for large-scale problems. Neurocomputing 74(11):1970–1984
Rajih M, Comon P, Harshman RA (2008) Enhanced line search: a novel method to accelerate PARAFAC. SIAM J Matrix Anal Appl 30(3):1128–1147
Sidiropoulos ND, Bro R (2000) On the uniqueness of multilinear decomposition of N-way arrays. J Chemom 14(3):229–239
Simonacci V, Gallo M (2019) Improving PARAFAC-ALS estimates with a double optimization procedure. Chemom Intell Lab Syst. https://doi.org/10.1016/j.chemolab.2019.103822
ten Berge JM, Sidiropoulos ND (2002) On uniqueness in CANDECOMP/PARAFAC. Psychometrika 67(3):399–409
Timmerman ME, Kiers HA (2000) Three-mode principal components analysis: choosing the numbers of components and sensitivity to local optima. Br J Math Stat Psychol 53(1):1–16
Tomasi G, Bro R (2006) A comparison of algorithms for fitting the PARAFAC model. Comput Stat Data Anal 50(7):1700–1734
Tucker LR (1966) Some mathematical notes on three-mode factor analysis. Psychometrika 31(3):279–311
Wu HL, Shibukawa M, Oguma K (1998) An alternating trilinear decomposition algorithm with application to calibration of HPLC-DAD for simultaneous determination of overlapped chlorinated aromatic hydrocarbons. J Chemom 12(1):1–26
Xia AL, Wu HL, Fang DM, Ding YJ, Hu LQ, Yu RQ (2005) Alternating penalty trilinear decomposition algorithm for second-order calibration with application to interference-free analysis of excitation-emission matrix fluorescence data. J Chemom 19(2):65–76
Xia AL, Wu HL, Zhang Y, Zhu SH, Han QJ, Yu RQ (2007) A novel efficient way to estimate the chemical rank of high-way data arrays. Anal Chim Acta 598(1):1–11
Yu YJ, Wu HL, Nie JF, Zhang SR, Li SF, Li YN, Zhu SH, Yu RQ (2011) A comparison of several trilinear second-order calibration algorithms. Chemom Intell Lab Syst 106(1):93–107
Yu YJ, Wu HL, Kang C, Wang Y, Zhao J, Li YN, Liu YJ, Yu RQ (2012) Algorithm combination strategy to obtain the second-order advantage: simultaneous determination of target analytes in plasma using three-dimensional fluorescence spectroscopy. J Chemom 26(5):197–208
Zhang SR, Wu HL, Yu RQ (2015) A study on the differential strategy of some iterative trilinear decomposition algorithms: PARAFAC-ALS, ATLD, SWATLD, and APTLD. J Chemom 29(3):179–192
Zijlstra BJ, Kiers HA (2002) Degenerate solutions obtained from several variants of factor analysis. J Chemom 16(11):596–605
Acknowledgements
The work of both authors was completed with the support of the University of Naples—“L’Orientale”; no dedicated funding was assigned to this project.
Ethics declarations
Conflict of interest
The first author declares that she has no conflict of interest; the second author declares that he has no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by M. Squillante.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
In all simulations carried out in this manuscript, synthetic data are generated in the same fashion. The rank R and the dimensions I, J and K of the artificial tensor are set as desired; then, the loading matrices \(\mathbf{A } \in {\mathbb {R}}^{ I \times R}\), \(\mathbf{B } \in {\mathbb {R}}^{ J \times R}\) and \(\mathbf{C } \in {\mathbb {R}}^{ K\times R}\) are generated from a uniform distribution.
Next, a set value is assigned to the congruence among the factors of each loading matrix (CONG). This is achieved by first orthogonalizing the random matrices by means of the QR method and then replacing the upper triangular matrix with the output of a Cholesky decomposition of the \(R \times R\) square matrix with 1s on the diagonal and the parameter CONG everywhere else. A pure tensor \(\tilde{\mathscr {T}}^{I,J,K}\) is thus reconstructed using Eq. 1; then, noise contamination is added.
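The congruence step can be illustrated with a minimal NumPy sketch. It is not the authors' R code: the function name `loadings_with_congruence`, the dimensions, and the seed are hypothetical, and Eq. 1 is assumed here to be the standard CP model \(t_{ijk} = \sum _{r} a_{ir} b_{jr} c_{kr}\). The sketch also assumes that each dimension is at least R and that CONG lies in a range where the target matrix is positive definite.

```python
import numpy as np

def loadings_with_congruence(n_rows, rank, cong, rng):
    """Random loading matrix whose columns have pairwise congruence `cong`."""
    # Uniform random starting values, as described in the text.
    raw = rng.uniform(size=(n_rows, rank))
    # Reduced QR: q has orthonormal columns (requires n_rows >= rank).
    q, _ = np.linalg.qr(raw)
    # Target cross-product matrix: 1s on the diagonal, `cong` elsewhere.
    target = np.full((rank, rank), float(cong))
    np.fill_diagonal(target, 1.0)
    # NumPy returns the lower factor L with target = L @ L.T,
    # so the upper triangular factor is its transpose.
    upper = np.linalg.cholesky(target).T
    # (q @ upper).T @ (q @ upper) = upper.T @ upper = target.
    return q @ upper

# Hypothetical sizes and congruence value, chosen only for illustration.
I, J, K, R, CONG = 6, 5, 4, 3, 0.5
rng = np.random.default_rng(0)
A = loadings_with_congruence(I, R, CONG, rng)
B = loadings_with_congruence(J, R, CONG, rng)
C = loadings_with_congruence(K, R, CONG, rng)

# Noise-free tensor under the assumed CP model.
T = np.einsum("ir,jr,kr->ijk", A, B, C)
```

Because the orthonormal factor is multiplied by the Cholesky factor of the target matrix, every column has unit norm and every pair of columns has inner product CONG, which is what Tucker's congruence coefficient measures.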
Specifically, two tensors \(\mathscr {E}_{\mathrm{HO}}\) and \(\mathscr {E}_{\mathrm{HE}}\), containing homoscedastic and heteroscedastic residuals, respectively, are added to the noise-free reconstructed data. These error tensors are generated as normally distributed values; however, in the case of \(\mathscr {E}_{\mathrm{HE}}\), the random values are multiplied by the corresponding elements of the pure tensor to ensure different weights.
The percentages of noise contamination, HO and HE, are specified as proportions of the total variation of \(\tilde{\mathscr {T}}^{I,J,K}\), by first imposing a Frobenius norm of \(\sum _{k=1}^{K}\tilde{{\mathbf {T}}}_{::k}^{2}\) on each error tensor and then multiplying the array by an appropriate scalar:
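As a rough illustration of the noise step, the following NumPy sketch generates both error tensors and rescales them. It is not the authors' R code, and the exact scalar they use is not recoverable from this text: the sketch assumes that HO and HE are given as fractions and that each error tensor ends up with a sum of squares equal to that fraction of the pure tensor's sum of squares. The function names and the example values are hypothetical.

```python
import numpy as np

def match_sum_of_squares(arr, reference):
    """Rescale `arr` so that its sum of squares equals that of `reference`."""
    return arr * np.sqrt(np.sum(reference ** 2) / np.sum(arr ** 2))

def add_noise(pure, ho, he, rng):
    """Return (noisy, e_ho, e_he) for noise fractions `ho` and `he`."""
    # Homoscedastic errors: plain standard normal draws.
    e_ho = rng.standard_normal(pure.shape)
    # Heteroscedastic errors: normal draws weighted by the pure tensor.
    e_he = rng.standard_normal(pure.shape) * pure
    # Step 1: give each error tensor the same sum of squares as `pure`.
    e_ho = match_sum_of_squares(e_ho, pure)
    e_he = match_sum_of_squares(e_he, pure)
    # Step 2 (ASSUMPTION): multiply by sqrt(fraction), so the error sum of
    # squares is `fraction` times that of the pure tensor.
    e_ho = e_ho * np.sqrt(ho)
    e_he = e_he * np.sqrt(he)
    return pure + e_ho + e_he, e_ho, e_he

# Hypothetical pure tensor and noise levels, for illustration only.
rng = np.random.default_rng(1)
pure = rng.uniform(size=(4, 5, 3))
noisy, e_ho, e_he = add_noise(pure, ho=0.05, he=0.10, rng=rng)
```

Separating the normalization from the final scalar mirrors the two steps in the text and makes it easy to swap in a different definition of the scalar.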
All calculations were carried out using the R language (R Core Team 2012), version 3.5.0, on a 2.3 GHz Intel Core i7 processor, and all procedures were written using base functions and the ThreeWay (Giordani et al. 2014) and multiway (Helwig 2017) packages.
Cite this article
Simonacci, V., Gallo, M. An ATLD–ALS method for the trilinear decomposition of large third-order tensors. Soft Comput 24, 13535–13546 (2020). https://doi.org/10.1007/s00500-019-04320-9
DOI: https://doi.org/10.1007/s00500-019-04320-9