Regularized Generalized Canonical Correlation Analysis: A Framework for Sequential Multiblock Component Methods

Abstract

A new framework for sequential multiblock component methods is presented. This framework relies on a new version of regularized generalized canonical correlation analysis (RGCCA) that encompasses various scheme functions and shrinkage constants. Two types of between-block connections are considered: blocks are either fully connected or connected to the superblock (the concatenation of all blocks). The proposed iterative algorithm is monotonically convergent and guarantees reaching a stationary point of RGCCA at convergence. In some cases, the solution of RGCCA is the first eigenvalue/eigenvector of a certain matrix. For the scheme functions \(x\), \({\vert }x{\vert }\), \(x^{2}\) or \(x^{4}\) and shrinkage constants 0 or 1, many multiblock component methods are recovered.

References

  • Addinsoft (2016). XLSTAT software, Paris.

  • Carroll, J. D. (1968a). A generalization of canonical correlation analysis to three or more sets of variables. Proceedings of the 76th Convention - American Psychological Association, pp. 227–228.

  • Carroll, J. D. (1968b). Equations and Tables for a generalization of canonical correlation analysis to three or more sets of variables. Unpublished companion paper to Carroll J.D.

  • Chessel, D., & Hanafi, M. (1996). Analyses de la co-inertie de \(K\) nuages de points. Revue de Statistique Appliquée, 44, 35–60.

  • Dahl, T., & Næs, T. (2006). A bridge between Tucker-1 and Carroll’s generalized canonical analysis. Computational Statistics and Data Analysis, 50, 3086–3098.

  • Dijkstra, T. K. (1981). Latent variables in linear stochastic models. PhD thesis. Amsterdam: Sociometric Research Foundation.

  • Dijkstra, T. K. (1983). Some comments on maximum likelihood and partial least squares methods. Journal of Econometrics, 22, 67–90.

  • Dijkstra, T. K., & Henseler, J. (2015). Consistent and asymptotically normal PLS estimators for linear structural equations. Computational Statistics and Data Analysis, 81, 10–23.

  • Escofier, B., & Pagès, J. (1994). Multiple factor analysis, (AFMULT package). Computational Statistics and Data Analysis, 18, 121–140.

  • Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4(3), 272–299.

  • Fessler, J. (2004). Monotone convergence. Lecture notes. https://web.eecs.umich.edu/~fessler/course/600/l/lmono.pdf.

  • Hair, J. F., Hult, G. T. M., Ringle, C. M., & Sarstedt, M. (2014). A primer on partial least squares structural equation modeling (PLS-SEM). Thousand Oaks, CA: SAGE.

  • Hanafi, M. (2007). PLS path modelling: Computation of latent variables with the estimation mode B. Computational Statistics, 22, 275–292.

  • Hanafi, M., & Kiers, H. A. L. (2006). Analysis of \(K\) sets of data, with differential emphasis on agreement between and within sets. Computational Statistics and Data Analysis, 51, 1491–1508.

  • Hanafi, M., Kohler, A., & Qannari, E. M. (2010). Shedding new light on hierarchical principal component analysis. Journal of Chemometrics, 24, 703–709.

  • Hanafi, M., Kohler, A., & Qannari, E. M. (2011). Connections between multiple co-inertia analysis and consensus principal component analysis. Chemometrics and Intelligent Laboratory Systems, 106, 37–40.

  • Hassani, S., Hanafi, M., Qannari, E. M., & Kohler, A. (2013). Deflation strategies for multi-block principal component analysis revisited. Chemometrics and Intelligent Laboratory Systems, 120, 154–168.

  • Horst, P. (1961a). Relations among \(m\) sets of measures. Psychometrika, 26, 126–149.

  • Horst, P. (1961b). Generalized canonical correlations and their applications to experimental data. Journal of Clinical Psychology (Monograph supplement), 14, 331–347.

  • Horst, P. (1965). Factor analysis of data matrices. New York: Holt, Rinehart and Winston.

  • Hotelling, H. (1936). Relations between two sets of variates. Biometrika, 28, 321–377.

  • Hwang, H., & Takane, Y. (2014). Generalized structured component analysis: A component-based approach to structural equation modeling. Boca Raton: CRC Press.

  • Jöreskog, K. G., & Wold, H. (1982). The ML and PLS techniques for modeling with latent variables, historical and comparative aspects. In K. G. Jöreskog & H. Wold (Eds.), Systems under indirect observation, Part 1 (pp. 263–270). Amsterdam: North-Holland.

  • Journée, M., Nesterov, Y., Richtárik, P., & Sepulchre, R. (2010). Generalized power method for sparse principal component analysis. The Journal of Machine Learning Research, 11, 517–553.

  • Kettenring, J. R. (1969). Canonical analysis of several sets of variables. Unpublished Ph.D. thesis, Institute of Statistics Mimeo Series No. 615, University of North Carolina at Chapel Hill.

  • Kettenring, J. R. (1971). Canonical analysis of several sets of variables. Biometrika, 58, 433–451.

  • Krämer, N. (2007). Analysis of high-dimensional data with partial least squares and boosting. Doctoral dissertation. Technischen Universität Berlin.

  • Ledoit, O., & Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88, 365–411.

  • Lohmöller, J.-B. (1989). Latent variables path modeling with partial least squares. Heidelberg: Springer (reprinted 2013).

  • McDonald, R. P. (1968). A unified treatment of the weighting problem. Psychometrika, 33, 351–381.

  • McDonald, R. P. (1996). Path analysis with composite variables. Multivariate Behavioral Research, 31, 239–270.

  • McKeon, J. J. (1966). Canonical analysis: Some relation between canonical correlation, factor analysis, discriminant analysis, and scaling theory. Psychometric Monograph, 13.

  • Meyer, R. R. (1976). Sufficient conditions for the convergence of monotonic mathematical programming algorithms. Journal of Computer and System Sciences, 12(1), 108–121.

  • Ringle, C. M., Wende, S., & Becker, J.-M. (2015). SmartPLS 3. Bönningstedt: SmartPLS GmbH.

  • Schäfer, J., & Strimmer, K. (2005). A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical Applications in Genetics and Molecular Biology, 4(1), Article 32.

  • Smilde, A. K., Westerhuis, J. A., & de Jong, S. (2003). A framework for sequential multiblock component methods. Journal of Chemometrics, 17, 323–337.

  • Steel, R. G. D. (1951). Minimum generalized variance for a set of linear functions. Annals of Mathematical Statistics, 22, 456–460.

  • Ten Berge, J. M. F. (1988). Generalized approaches to the MAXBET problem and the MAXDIFF problem, with applications to canonical correlations. Psychometrika, 53, 487–494.

  • Tenenhaus, M. (2008). Component-based structural equation modelling. Total Quality Management & Business Excellence, 19(7), 871–886.

  • Tenenhaus, A., & Guillemot, V. (2017). RGCCA: Regularized and sparse generalized canonical correlation analysis for multiblock data. http://cran.project.org/web/packages/RGCCA/index.html.

  • Tenenhaus, A., & Tenenhaus, M. (2011). Regularized generalized canonical correlation analysis. Psychometrika, 76, 257–284.

  • Tenenhaus, A., & Tenenhaus, M. (2014). Regularized generalized canonical correlation analysis for multiblock or multigroup data analysis. European Journal of Operational Research, 238, 391–403.

  • Tenenhaus, M., Esposito Vinzi, V., Chatelin, Y.-M., & Lauro, C. (2005). PLS path modeling. Computational Statistics & Data Analysis, 48, 159–205.

  • Tucker, L. R. (1958). An inter-battery method of factor analysis. Psychometrika, 23, 111–136.

  • Van de Geer, J. P. (1984). Linear relations among \(k\) sets of variables. Psychometrika, 49, 70–94.

  • Van den Wollenberg, A. L. (1977). Redundancy analysis—An alternative to canonical correlation analysis. Psychometrika, 42, 207–219.

  • Wangen, L. E., & Kowalski, B. R. (1989). A multiblock partial least squares algorithm for investigating complex chemical systems. Journal of Chemometrics, 3, 3–20.

  • Westerhuis, J. A., Kourti, T., & MacGregor, J. F. (1998). Analysis of multiblock and hierarchical PCA and PLS models. Journal of Chemometrics, 12, 301–321.

  • Widaman, K. F. (1993). Common factor analysis versus principal component analysis: Differential bias in representing model parameters? Multivariate Behavioral Research, 28(3), 263–311.

  • Wold, H. (1966). Nonlinear estimation by iterative least square procedures. In F. N. David (Ed.), Festschrift for Jerzy Neyman, Research papers in Statistics (pp. 411–444). London: Wiley.

  • Wold, H. (1982). Soft modeling: The basic design and some extensions. In K. G. Jöreskog & H. Wold (Eds.), Systems under indirect observation, Part 2 (pp. 1–54). Amsterdam: North-Holland.

  • Wold, H. (1985). Partial least squares. In S. Kotz & N. L. Johnson (Eds.), Encyclopedia of statistical sciences (Vol. 6, pp. 581–591). New York: Wiley.

  • Wold, S., Hellberg, S., Lundstedt, T., Sjöström, M., & Wold, H. (1987). PLS modeling with latent variables in two or more dimensions. In Proceedings of the symposium on PLS model building: Theory and application (pp. 1–21). Frankfurt am Main.

  • Wold, S., Kettaneh, N., & Tjessem, K. (1996). Hierarchical multiblock PLS and PC models for easier model interpretation and as an alternative to variable selection. Journal of Chemometrics, 10, 463–482.

Acknowledgements

The authors are very grateful for the insightful and constructive comments made by the editor, the associate editor and the anonymous reviewers that have greatly improved the paper.

Corresponding author

Correspondence to Arthur Tenenhaus.

Appendix: An algorithm for maximizing a multi-convex continuously differentiable function

We consider a multi-convex continuously differentiable objective function \(f\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_J } \right) :\mathbb {R}^{p_1 }\times \cdots \times \mathbb {R}^{p_J }\rightarrow \mathbb {R}\) (i.e., for each j, f is a convex function of \(\mathbf{v}_j \) while all the other \(\mathbf{v}_k \) are fixed) and the following optimization problem:

$$\begin{aligned} \hbox {Maximize }f\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_J } \right) \hbox { subject to }{} \mathbf{v}_j^t \mathbf{v}_j =1,\hbox { }j=1,\ldots ,J. \end{aligned}$$
(91)

We denote by \(\nabla _j f\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_J } \right) \) the partial gradient of \(f\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_J } \right) \) with respect to \(\mathbf{v}_j \). We assume \(\nabla _j f\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_J } \right) \ne \mathbf{0}\) in this paper. This assumption is not very restrictive, as \(\nabla _j f\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_J } \right) =\mathbf{0}\) characterizes the global minimum of \(f\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_J } \right) \) with respect to \(\mathbf{v}_j \) when the other vectors \(\mathbf{v}_1 ,\ldots ,\mathbf{v}_{j-1} ,\mathbf{v}_{j+1} ,\ldots ,\mathbf{v}_J \) are fixed. Therefore, we can introduce the unit-norm partial gradient \(r_j \left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_J } \right) =\nabla _j f\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_J } \right) /\left\| {\nabla _j f\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_J } \right) } \right\| \). Canceling the partial gradients of the Lagrangian function of (91) with respect to each \(\mathbf{v}_j \), and taking into account the normalization constraints, yields the following stationary equations:

$$\begin{aligned} \mathbf{v}_j =r_j \left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_J } \right) , \quad j=1,\ldots ,J. \end{aligned}$$
(92)

Solutions of (92) are called stationary points of (91). We propose the following Algorithm 2 to solve (91):

Algorithm 2 Gradient algorithm for problem (91). The threshold \(\varepsilon \) is a small positive constant that determines the desired accuracy of the algorithm.
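
The algorithm itself is displayed as a figure in the published article and is not reproduced here. The following Python sketch illustrates one way to implement the iteration described in the surrounding text, under the assumption that update (93) sets \(\mathbf{v}_j \leftarrow r_j \left( \mathbf{v} \right) \) cyclically over \(j=1,\ldots ,J\) and that the stopping rule compares \(\sum _j \left\| {\mathbf{v}_j^{s+1} -\mathbf{v}_j^s } \right\| \) with \(\varepsilon \); the function names below are illustrative choices, not the authors' pseudocode.

```python
import numpy as np

def algorithm_2(partial_grads, f, v0, eps=1e-8, max_iter=1000):
    """Illustrative sketch of the block-coordinate scheme for problem (91).

    partial_grads[j](v) returns the partial gradient of f with respect to
    block v[j]; f(v) evaluates the objective; v0 is a list of starting
    vectors.  Each block is updated by the unit-norm partial gradient
    r_j(v) = grad_j f(v) / ||grad_j f(v)||, and the loop stops once
    sum_j ||v_j^{s+1} - v_j^s|| < eps (assumed stopping rule).
    """
    v = [vj / np.linalg.norm(vj) for vj in v0]           # start on Omega
    for _ in range(max_iter):
        v_old = [vj.copy() for vj in v]
        for j, grad_j in enumerate(partial_grads):
            g = grad_j(v)                                # assumed nonzero
            v[j] = g / np.linalg.norm(g)                 # v_j <- r_j(v)
        if sum(np.linalg.norm(a - b) for a, b in zip(v, v_old)) < eps:
            break
    return v, f(v)
```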

In the case of a single block \((J=1)\), Algorithm 2 is similar to the gradient-based algorithm proposed by Journée, Nesterov, Richtárik and Sepulchre (2010) for maximizing a convex function of several variables with spherical constraints (see Problem 27, p. 529). For studying the convergence properties of Algorithm 2, it is useful to introduce some notation: \(\Omega _j =\left\{ {\mathbf{v}_j \in \mathbb {R}^{p_j };\left\| {\mathbf{v}_j } \right\| =1} \right\} \), \(\Omega =\Omega _1 \times \Omega _2 \times \cdots \times \Omega _J \), \(\mathbf{v}=\left( {\mathbf{v}_1 ,\mathbf{v}_2 ,\ldots ,\mathbf{v}_J} \right) \in \Omega \), \(c_j :\Omega \rightarrow \Omega \) is an operator defined as \(c_j \left( \mathbf{v} \right) =\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_{j-1} ,r_j (\mathbf{v}),\mathbf{v}_{j+1},\ldots ,\mathbf{v}_J} \right) \), and \(c:\Omega \rightarrow \Omega \) is defined as \(c=c_J \circ c_{J-1} \circ \cdots \circ c_1 \). We consider the sequence \(\left\{ {\mathbf{v}^s =\left( {\mathbf{v}_1^s ,\ldots ,\mathbf{v}_J^s } \right) } \right\} \) generated by Algorithm 2. Using the operator c, the “for loop” and equations (93) inside Algorithm 2 can be replaced by the following recurrence relation:

$$\begin{aligned} \mathbf{v}^{s+1}=c\left( {\mathbf{v}^{s}} \right) . \end{aligned}$$
(94)

Note that the set of stationary points of problem (91) is equal to the set of fixed points of c. To study the convergence properties of Algorithm 2, we will consider the infinite sequence \(\left\{ {\mathbf{v}^s } \right\} _{s=0}^\infty \) generated by (94). The convergence properties of Algorithm 2 are summarized in the following proposition:
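
As a small numerical illustration of the single-block case mentioned above (a toy example added here, reusing the algorithm_2 sketch given after Algorithm 2): for \(J=1\) and the convex objective \(f\left( \mathbf{v} \right) =\mathbf{v}^{t}\mathbf{Av}\) with \(\mathbf{A}\) symmetric positive semi-definite, the update \(\mathbf{v}\leftarrow \mathbf{Av}/\left\| {\mathbf{Av}} \right\| \) is the power iteration, so the fixed point of c is the leading eigenvector of \(\mathbf{A}\).

```python
# Toy single-block case (J = 1): f(v) = v' A v with A symmetric positive
# semi-definite is convex, its partial gradient is 2 A v, and the update
# v <- A v / ||A v|| is the power iteration; its fixed point is the
# leading eigenvector of A.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
grads = [lambda v: 2.0 * A @ v[0]]

def f(v):
    return float(v[0] @ A @ v[0])

v_star, f_star = algorithm_2(grads, f, [np.ones(3)])
# f_star approximates the largest eigenvalue of A and v_star[0] the
# corresponding unit-norm eigenvector.
```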

Proposition 1

Let \(\left\{ {\mathbf{v}^s } \right\} _{s=0}^\infty \) be any sequence generated by the recurrence relation \(\mathbf{v}^{s+1}=c\left( {\mathbf{v}^{s}} \right) \) with \(\mathbf{v}^{0}\in \Omega \). Then, the following properties hold:

  1. (a)

The sequence \(\left\{ {f\left( {\mathbf{v}^{s}} \right) } \right\} \) is monotonically increasing and therefore convergent, as f is bounded on \(\Omega \). This result implies the monotonic convergence of Algorithm 2.

  2. (b)

    If the infinite sequence \(\left\{ {f\left( {\mathbf{v}^{s}} \right) } \right\} \) involves a finite number of distinct terms, then the last distinct point satisfies \(c\left( {\mathbf{v}^{s}} \right) =\mathbf{v}^{s}\) and therefore is a stationary point of problem (91).

  3. (c)

    The limit of any convergent subsequence of \(\left\{ {\mathbf{v}^s } \right\} \) is a fixed point of c.

  4. (d)

    \(\lim \limits _{s\rightarrow \infty } f\left( {\mathbf{v}^{s}} \right) =f\left( {\mathbf{v}^{*}} \right) ,\) where \(\mathbf{v}^*\) is a fixed point of c.

  5. (e)

    The sequence \(\left\{ {\mathbf{v}^s =\left( {\mathbf{v}_1^s ,\ldots ,\mathbf{v}_J^s } \right) } \right\} \) is asymptotically regular: \(\lim \limits _{s\rightarrow \infty } \sum \limits _{j=1}^J {\left\| {\mathbf{v}_j^{s+1} -\mathbf{v}_j^s } \right\| } =0\). This result implies that if the threshold \(\varepsilon \) for the stopping criterion in Algorithm 2 is made sufficiently small, the output of Algorithm 2 will be a stationary point of (91).

  6. (f)

    If equation \(\mathbf{v}=c\left( \mathbf{v} \right) \) has a finite number of solutions, then the sequence \(\left\{ {\mathbf{v}^s } \right\} \) converges to one of them.

Note that points (a) and (b) concern the behavior of the sequence of values \(\left\{ {f\left( {\mathbf{v}^{s}} \right) } \right\} \) of the objective function, whereas points (c) to (f) are about the behavior of the sequence \(\left\{ {\mathbf{v}^s =\left( {\mathbf{v}_1^s ,\ldots ,\mathbf{v}_J^s } \right) } \right\} \). The results given in the following lemma are useful for proving Proposition 1.
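
A toy numerical check of properties (a) and (e), added here as an illustration rather than taken from the paper, can be run on the two-block objective \(f\left( {\mathbf{v}_1 ,\mathbf{v}_2 } \right) =\left( {\mathbf{v}_1^t \mathbf{M}{} \mathbf{v}_2 } \right) ^{2}\) with \(\mathbf{M}=\mathbf{X}_1^t \mathbf{X}_2 \), which is convex in each block separately because it is the square of a linear function of that block; the data and dimensions below are arbitrary.

```python
import numpy as np

# Toy check of Proposition 1(a) and 1(e) on the two-block objective
# f(v1, v2) = (v1' M v2)^2, convex in each block separately.
rng = np.random.default_rng(1)
X1, X2 = rng.standard_normal((50, 4)), rng.standard_normal((50, 6))
M = X1.T @ X2

def f(v):
    return float(v[0] @ M @ v[1]) ** 2

grads = [lambda v: 2.0 * (v[0] @ M @ v[1]) * (M @ v[1]),
         lambda v: 2.0 * (v[0] @ M @ v[1]) * (M.T @ v[0])]

v = [rng.standard_normal(4), rng.standard_normal(6)]
v = [vj / np.linalg.norm(vj) for vj in v]
values, steps = [], []
for s in range(1000):                      # recurrence v^{s+1} = c(v^s)
    v_old = [vj.copy() for vj in v]
    for j, grad_j in enumerate(grads):
        g = grad_j(v)
        v[j] = g / np.linalg.norm(g)       # v_j <- r_j(v)
    values.append(f(v))
    steps.append(sum(np.linalg.norm(a - b) for a, b in zip(v, v_old)))

assert all(b >= a - 1e-8 for a, b in zip(values, values[1:]))  # property (a)
assert steps[-1] < 1e-6                                        # property (e)
```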

Lemma 1

Consider the set \(\Omega \), the function \(f:\Omega \rightarrow \mathbb {R}\) and the operator \(c:\Omega \rightarrow \Omega \) defined above. Then, the following properties hold:

  1. (i)

\(\Omega \) is a compact set;

  2. (ii)

    c is a continuous operator;

  3. (iii)

    \(f\left( \mathbf{v} \right) \le f\left( {c\left( \mathbf{v} \right) } \right) \) for any \(\mathbf{v}\in \Omega \);

  4. (iv)

    If \(f\left( \mathbf{v} \right) =f\left( {c\left( \mathbf{v} \right) } \right) \), then \(c\left( \mathbf{v} \right) =\mathbf{v}\).

Proof of Lemma 1

Point (i): \(\Omega \) is compact because it is the Cartesian product of J compact sets.

Point (ii): Since f is continuously differentiable, \(r_j \) is continuous. This implies that \(c_j :\Omega \rightarrow \Omega \) is a continuous operator. The operator \(c=c_J \circ c_{J-1} \circ \cdots \circ c_1 \) is also continuous as the composition of J continuous operators.

Point (iii): Let \(\mathbf{v}=\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_j,\ldots ,\mathbf{v}_J} \right) \in \Omega \). First, we want to find an update \({{\hat{\mathbf{v}}}}_j \in \Omega _j \) of \(\mathbf{v}_j\) such that \(f\left( \mathbf{v} \right) \le f\left( {\mathbf{v}_1,\ldots ,\mathbf{v}_{j-1} ,{{\hat{\mathbf{v}}}}_j ,\mathbf{v}_{j+1},\ldots ,\mathbf{v}_J} \right) \). For that purpose, we use the following inequality which states that a convex function lies above its linear approximation at \(\mathbf{v}_j\) for any \({\tilde{\mathbf{v}}}_j \in \Omega _j \):

$$\begin{aligned} f\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_{j-1} ,{\tilde{\mathbf{v}}}_j ,\mathbf{v}_{j+1},\ldots ,\mathbf{v}_J} \right) \ge f\left( \mathbf{v} \right) +\left( {\nabla _j f\left( \mathbf{v} \right) } \right) ^{t}\left( {{\tilde{\mathbf{v}}}_j -\mathbf{v}_j} \right) =\ell _j \left( {{\tilde{\mathbf{v}}}_j ,\mathbf{v}} \right) . \end{aligned}$$
(95)

Using the Cauchy–Schwarz inequality, we obtain the unique maximizer \({{\hat{\mathbf{v}}}}_j \in \Omega _j \) of \(\ell _j \left( {{\tilde{\mathbf{v}}}_j ,\mathbf{v}} \right) \) with respect to \({\tilde{\mathbf{v}}}_j \in \Omega _j \):

$$\begin{aligned} {{\hat{\mathbf{v}}}}_j =\arg \mathop {\max }\limits _{{\tilde{\mathbf{v}}}_j \in \Omega _j } \ell _j \left( {{\tilde{\mathbf{v}}}_j ,\mathbf{v}} \right) \hbox { }=\nabla _j f\left( \mathbf{v} \right) /\left\| {\nabla _j f\left( \mathbf{v} \right) } \right\| =r_j \left( \mathbf{v} \right) . \end{aligned}$$
(96)

We deduce from (95) the following inequalities:

$$\begin{aligned} f\left( \mathbf{v} \right) =\ell _j \left( {\mathbf{v}_j ,\mathbf{v}} \right) \le \ell _j \left( {r_j \left( \mathbf{v} \right) ,\mathbf{v}} \right) \le f\left( {\mathbf{v}_1 ,\ldots ,\mathbf{v}_{j-1} ,r_j \left( \mathbf{v} \right) ,\mathbf{v}_{j+1} ,\ldots ,\mathbf{v}_J} \right) =f\left( {c_j \left( \mathbf{v} \right) } \right) . \end{aligned}$$
(97)

This implies that updating \(\mathbf{v}_j\) to \({{\hat{\mathbf{v}}}}_j =r_j \left( \mathbf{v} \right) \) does not decrease \(f\left( \mathbf{v} \right) \). Moreover, the following inequality is deduced from (97) for each \(j=2,\ldots ,J\):

$$\begin{aligned} f\left( {c_{j-1} \circ \cdots \circ c_1 \left( \mathbf{v} \right) } \right) \le f\left( {c_j \circ c_{j-1} \circ \cdots \circ c_1 \left( \mathbf{v} \right) } \right) . \end{aligned}$$
(98)

This yields the desired inequalities for any \(\mathbf{v}\in \Omega \):

$$\begin{aligned} f\left( \mathbf{v} \right) \le f\left( {c_1 \left( \mathbf{v} \right) } \right) \le f\left( {c_2 \circ c_1 \left( \mathbf{v} \right) } \right) \le \cdots \le f\left( {c_J \circ \cdots \circ c_1 \left( \mathbf{v} \right) } \right) =f\left( {c\left( \mathbf{v} \right) } \right) . \end{aligned}$$
(99)

Point (iv): If \(f\left( \mathbf{v} \right) =f\left( {c\left( \mathbf{v} \right) } \right) \) for \(\mathbf{v}\in \Omega \), then (99) implies

$$\begin{aligned} f\left( \mathbf{v} \right) =f\left( {c_1 \left( \mathbf{v} \right) } \right) =f\left( {c_2 \circ c_1 \left( \mathbf{v} \right) } \right) =\cdots =f\left( {c_J \circ \cdots \circ c_1 \left( \mathbf{v} \right) } \right) =f\left( {c\left( \mathbf{v} \right) } \right) . \end{aligned}$$
(100)

Using (97), the equality \(f\left( \mathbf{v} \right) =f\left( {c_1 \left( \mathbf{v} \right) } \right) \) implies \(\ell _1 \left( {r_1 \left( \mathbf{v} \right) ,\mathbf{v}} \right) =\ell _1 \left( {\mathbf{v}_1 ,\mathbf{v}} \right) \) and therefore \(\mathbf{v}_1 =r_1 \left( \mathbf{v} \right) \) as \(r_1 (\mathbf{v})\) is the unique maximizer of \(\ell _1 \left( {{\tilde{\mathbf{v}}}_1 ,\mathbf{v}} \right) \) with respect to \({\tilde{\mathbf{v}}}_1 \in \Omega _1 \). From this result, we deduce \(\mathbf{v}=\left( {\mathbf{v}_1 ,\mathbf{v}_2 ,\ldots ,\mathbf{v}_J} \right) =\left( {r_1 (\mathbf{v}),\mathbf{v}_2 ,\ldots ,\mathbf{v}_J} \right) =c_1 \left( \mathbf{v} \right) \) and then, by transitivity,

$$\begin{aligned} \mathbf{v}=c_1 \left( \mathbf{v} \right) =c_2 \circ c_1 \left( \mathbf{v} \right) =\cdots =c_J \circ \cdots \circ c_1 \left( \mathbf{v} \right) =c\left( \mathbf{v} \right) . \end{aligned}$$
(101)

Proof of Proposition 1

Point (a): Point (iii) of Lemma 1 implies that the sequence \(\left\{ {f\left( {\mathbf{v}^s } \right) } \right\} \) is monotonically increasing, and therefore, convergent as the continuous function f is bounded on the compact set \(\Omega \).

Point (b): If the infinite sequence \(\left\{ {f\left( {\mathbf{v}^s } \right) } \right\} \) has a finite number of distinct terms, it cannot be a strictly increasing sequence and consequently there exists some integer M such that \(f\left( {\mathbf{v}^0 } \right)<f\left( {\mathbf{v}^1 } \right)<\cdots <f\left( {\mathbf{v}^M } \right) =f\left( {\mathbf{v}^{M+1} } \right) \). Then, Point (iv) of Lemma 1 implies that \(\mathbf{v}^M \) is a fixed point of c.

Points (c) to (f): They are deduced from a direct application of Meyer’s monotone convergence theorem (Theorem 3.1 in Meyer, 1976). This theorem gives quite general conditions under which a sequence \(\left\{ {\mathbf{v}^s } \right\} \) produced by an algorithm that monotonically increases a continuous objective function will converge. Meyer considered the case of a point-to-set operator \(c:\Omega \rightarrow \mathscr {P}\left( \Omega \right) \), where \(\mathscr {P}\left( \Omega \right) \) is the set of all nonempty subsets of \(\Omega \). In this paper, c is a point-to-point operator and the conditions of Meyer’s theorem reduce to the three following conditions (see Fessler, 2004): (1) c is a continuous operator; (2) c is strictly monotone (increasing) with respect to f; and (3) c is uniformly compact on \(\Omega \). Condition (2) means that points (iii) and (iv) of Lemma 1 are verified. Condition (3) means that there exists a compact set K such that \(c\left( \mathbf{v} \right) \in K\) for all \(\mathbf{v}\in \Omega \). According to Lemma 1, these three conditions are satisfied for Algorithm 2 and therefore, Meyer’s theorem can be applied to any sequence \(\left\{ {\mathbf{v}^{s}} \right\} \) produced by the recurrence equation \(\mathbf{v}^{s+1}=c\left( {\mathbf{v}^{s}} \right) \) with \(\mathbf{v}^{0}\in \Omega \). Thus, the following results are obtained:

  1. (1)

    All subsequence limit points of \(\left\{ {\mathbf{v}^{s}} \right\} \) are fixed points of c;

  2. (2)

    \(f\left( {\mathbf{v}^{s}} \right) \rightarrow f\left( {\mathbf{v}^{*}} \right) ,\) where \(\mathbf{v}^{*}\) is a fixed point of c;

  3. (3)

    \(\sum \limits _{j=1}^J {\left\| {\mathbf{v}_j^{s+1} -\mathbf{v}_j^s } \right\| } \rightarrow 0;\)

  4. (4)

    Either \(\left\{ {\mathbf{v}^{s}} \right\} \) converges or the subsequence limit points of \(\left\{ {\mathbf{v}^{s}} \right\} \) form a continuum.

Point (4) implies that if equation \(\mathbf{v}=c\left( \mathbf{v} \right) \) admits a finite number of solutions, the sequence \(\left\{ {\mathbf{v}^{s}} \right\} \) converges to one of them.

Cite this article

Tenenhaus, M., Tenenhaus, A. & Groenen, P.J.F. Regularized Generalized Canonical Correlation Analysis: A Framework for Sequential Multiblock Component Methods. Psychometrika 82, 737–777 (2017). https://doi.org/10.1007/s11336-017-9573-x
