# Simultaneous Diagonalization Algorithms with Applications in Multivariate Statistics

• Bernard D. Flury
• Beat E. Neuenschwander
Chapter
Part of the ISNM International Series of Numerical Mathematics book series (ISNM, volume 119)

## Abstract

The following problem arises from multivariate statistical models in principal component and canonical correlation analysis. Let
$$S = \begin{bmatrix} S_{11} & \cdots & S_{1k} \\ \vdots & \ddots & \vdots \\ S_{k1} & \cdots & S_{kk} \end{bmatrix}$$
denote a positive definite symmetric (pds) matrix of dimension pk × pk, partitioned into submatrices $S_{ij}$ of dimension p × p each, and suppose we wish to find a nonsingular p × p matrix B such that all $B'S_{ij}B$ are "almost diagonal". More precisely, for a partitioned pk × pk matrix
$$A = \begin{bmatrix} A_{11} & \cdots & A_{1k} \\ \vdots & \ddots & \vdots \\ A_{k1} & \cdots & A_{kk} \end{bmatrix}$$
we define the parallel-diagonal operator as
$$\operatorname{pdiag}(A) = \begin{bmatrix} \operatorname{diag}(A_{11}) & \cdots & \operatorname{diag}(A_{1k}) \\ \vdots & \ddots & \vdots \\ \operatorname{diag}(A_{k1}) & \cdots & \operatorname{diag}(A_{kk}) \end{bmatrix}$$
and suggest using det{pdiag(A)}/det(A) as a measure of deviation from "parallel-diagonality", provided A is pds. For a nonsingular p × p matrix B, we study the function
$$\Phi(B;\,S) = \frac{\det\left[\operatorname{pdiag}\left\{(I_k \otimes B)'\,S\,(I_k \otimes B)\right\}\right]}{\det\left[(I_k \otimes B)'\,S\,(I_k \otimes B)\right]}$$

The matrix B which minimizes Φ is said to transform S to almost parallel-diagonal form. We give an algorithm for minimizing Φ over B in (i) the group of orthogonal p × p matrices, and (ii) the set of nonsingular p × p matrices such that diag(B′B) = $I_p$, and study its convergence. Statistical applications of the algorithm occur in maximum likelihood estimation of (i) common principal components for dependent random vectors (Neuenschwander 1994), and (ii) common canonical variates (Neuenschwander and Flury 1994). This work generalizes and extends the FG diagonalization algorithm of Flury and Gautschi (1986).
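The pdiag operator and the criterion Φ are straightforward to compute numerically. The sketch below (a minimal NumPy illustration, not the authors' implementation; the function names `pdiag` and `phi` are our own) builds pdiag by zeroing the off-diagonal entries of each p × p block, then evaluates Φ(B; S) directly from its definition. Note that Φ(B; S) ≥ 1 for any pds S, by iterated application of Fischer's inequality to the permuted block-diagonal form of pdiag, with equality exactly when (I_k ⊗ B)′S(I_k ⊗ B) is already parallel-diagonal.

```python
import numpy as np

def pdiag(A, p, k):
    """Parallel-diagonal operator: keep only the diagonal of each
    p x p submatrix A_ij of the pk x pk matrix A."""
    out = np.zeros_like(A)
    for i in range(k):
        for j in range(k):
            block = A[i*p:(i+1)*p, j*p:(j+1)*p]
            out[i*p:(i+1)*p, j*p:(j+1)*p] = np.diag(np.diag(block))
    return out

def phi(B, S, p, k):
    """Criterion Phi(B; S) = det[pdiag{(I_k (x) B)' S (I_k (x) B)}]
                             / det[(I_k (x) B)' S (I_k (x) B)]."""
    T = np.kron(np.eye(k), B)       # I_k (x) B
    M = T.T @ S @ T
    return np.linalg.det(pdiag(M, p, k)) / np.linalg.det(M)
```

For example, if S = C ⊗ D with C a pds k × k matrix and D a diagonal pds p × p matrix, every block $S_{ij} = c_{ij}D$ is diagonal, so Φ(I_p; S) = 1; for a generic pds S the criterion exceeds 1, and minimizing it over B is the goal of the algorithm described above.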


## References

1. DeLeeuw J., Pruzansky S. A new computational method for the weighted Euclidean distance model. Psychometrika, 43: 479–490, 1978.
2. Flury B. Common Principal Components and Related Multivariate Models. Wiley, New York, 1988.
3. Flury B., Gautschi W. An algorithm for simultaneous orthogonal transformation of several positive definite symmetric matrices to nearly diagonal form. SIAM Journal on Scientific and Statistical Computing, 7: 169–184, 1986.
4. Flury B., Neuenschwander B. E. Principal component models for patterned covariance matrices, with applications to canonical correlation analysis of several sets of variables. In W. J. Krzanowski, ed., Descriptive Multivariate Analysis, Oxford University Press, 1994. In press.
5. Golub G. H., Van Loan C. F. Matrix Computations. The Johns Hopkins University Press, Baltimore, 1983.
6. Graybill F. A. Introduction to Matrices with Applications in Statistics. Wadsworth, Belmont (CA), 1969.
7. Henderson H. V., Searle S. R. Vec and vech operators for matrices, with some uses in Jacobians and multivariate statistics. Canadian Journal of Statistics, 7: 65–81, 1979.
8. Magnus J. R. Linear Structures. Charles Griffin & Co., London, 1988.
9. Magnus J. R., Neudecker H. Matrix Differential Calculus with Applications in Statistics and Econometrics. Wiley, New York, 1988.
10. Neuenschwander B. E. Common Principal Components for Dependent Random Vectors. Unpublished PhD thesis, University of Bern (Switzerland), Dept. of Statistics, 1991.
11. Neuenschwander B. E. A common principal component model for dependent random vectors, 1994. Submitted for publication.
12. Neuenschwander B. E., Flury B. Common canonical variates, 1994. Submitted for publication.
13. Searle S. R. Matrix Algebra Useful for Statistics. Wiley, New York, 1982.