Simultaneous Diagonalization Algorithms with Applications in Multivariate Statistics

  • Bernard D. Flury
  • Beat E. Neuenschwander
Part of the International Series of Numerical Mathematics (ISNM, volume 119)


The following problem arises from multivariate statistical models in principal component and canonical correlation analysis. Let
$$S = \begin{bmatrix} S_{11} & \cdots & S_{1k} \\ \vdots & \ddots & \vdots \\ S_{k1} & \cdots & S_{kk} \end{bmatrix}$$
denote a positive definite symmetric (pds) matrix of dimension pk × pk, partitioned into submatrices $S_{ij}$ of dimension p × p each, and suppose we wish to find a nonsingular p × p matrix B such that all $B'S_{ij}B$ are “almost diagonal”. More precisely, for a partitioned pk × pk matrix
$$A = \begin{bmatrix} A_{11} & \cdots & A_{1k} \\ \vdots & \ddots & \vdots \\ A_{k1} & \cdots & A_{kk} \end{bmatrix}$$
we define the parallel-diagonal operator as
$$\operatorname{pdiag}(A) = \begin{bmatrix} \operatorname{diag}(A_{11}) & \cdots & \operatorname{diag}(A_{1k}) \\ \vdots & \ddots & \vdots \\ \operatorname{diag}(A_{k1}) & \cdots & \operatorname{diag}(A_{kk}) \end{bmatrix}$$
and suggest using det{pdiag(A)}/det(A) as a measure of deviation from “parallel-diagonality”, provided A is pds. For a nonsingular p × p matrix B, we study the function
$$\Phi(B;\,S) = \frac{\det\left[\operatorname{pdiag}\left\{(I_k \otimes B)'\,S\,(I_k \otimes B)\right\}\right]}{\det\left[(I_k \otimes B)'\,S\,(I_k \otimes B)\right]}$$
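To make the definitions concrete, here is a minimal NumPy sketch of the pdiag operator and of Φ. This is an illustration only; the function names `pdiag` and `phi` and the block-indexing convention are ours, not from the paper.

```python
import numpy as np

def pdiag(A, p, k):
    """Parallel-diagonal operator: keep only the main diagonal of each
    p x p submatrix of the pk x pk matrix A, zeroing all other entries."""
    out = np.zeros_like(A)
    for i in range(k):
        for j in range(k):
            block = A[i*p:(i+1)*p, j*p:(j+1)*p]
            out[i*p:(i+1)*p, j*p:(j+1)*p] = np.diag(np.diag(block))
    return out

def phi(B, S, p, k):
    """Deviation of S from parallel-diagonality under the transformation
    T = (I_k kron B)' S (I_k kron B); the ratio equals 1 exactly when
    every p x p block of T is diagonal."""
    IkB = np.kron(np.eye(k), B)
    T = IkB.T @ S @ IkB
    return np.linalg.det(pdiag(T, p, k)) / np.linalg.det(T)
```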

The matrix B which minimizes Φ is said to transform S to almost parallel-diagonal form. We give an algorithm for minimizing Φ over B in (i) the group of orthogonal p × p matrices, and (ii) the set of nonsingular p × p matrices such that diag(B′B) = $I_p$, and we study its convergence. Statistical applications of the algorithm occur in maximum likelihood estimation of (i) common principal components for dependent random vectors (Neuenschwander 1994), and (ii) common canonical variates (Neuenschwander and Flury 1994). This work generalizes and extends the FG diagonalization algorithm of Flury and Gautschi (1986).
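As a rough computational baseline for case (i), and emphatically not the FG-type algorithm developed in the paper, one could minimize Φ over orthogonal matrices with a generic derivative-free optimizer, parametrizing B = expm(K) for skew-symmetric K. The sketch below reuses `phi` from the block above; all helper names are ours.

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

def skew(theta, p):
    """Assemble a p x p skew-symmetric matrix from its p(p-1)/2 free entries."""
    K = np.zeros((p, p))
    K[np.triu_indices(p, 1)] = theta
    return K - K.T

def minimize_phi_orthogonal(S, p, k):
    """Minimize Phi over orthogonal B via B = expm(K), K skew-symmetric,
    using Nelder-Mead (chosen only because it needs no gradients)."""
    def objective(theta):
        return phi(expm(skew(theta, p)), S, p, k)
    theta0 = np.zeros(p * (p - 1) // 2)  # start at B = I
    res = minimize(objective, theta0, method="Nelder-Mead")
    return expm(skew(res.x, p))

# Tiny demo: p = 2, k = 2; S is pds by construction.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
S = M @ M.T + 4.0 * np.eye(4)
B_hat = minimize_phi_orthogonal(S, p=2, k=2)
print(phi(B_hat, S, p=2, k=2))  # no larger than phi(np.eye(2), S, 2, 2)
```

Restricting to SO(p) via the exponential map loses nothing here, since Φ is unchanged when columns of B flip sign. For serious use, the paper's algorithm, which generalizes the FG rotation sweeps of Flury and Gautschi (1986), is the intended tool.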






References

[1] DeLeeuw J., Pruzansky S. A new computational method for the weighted Euclidean distance model. Psychometrika, 43: 479–490, 1978.
[2] Flury B. Common Principal Components and Related Multivariate Models. Wiley, New York, 1988.
[3] Flury B., Gautschi W. An algorithm for simultaneous orthogonal transformation of several positive definite symmetric matrices to nearly diagonal form. SIAM Journal on Scientific and Statistical Computing, 7: 169–184, 1986.
[4] Flury B., Neuenschwander B. E. Principal component models for patterned covariance matrices, with applications to canonical correlation analysis of several sets of variables. In W. J. Krzanowski (ed.), Descriptive Multivariate Analysis, Oxford University Press, 1994. In press.
[5] Golub G. H., Van Loan C. F. Matrix Computations. The Johns Hopkins University Press, Baltimore, 1983.
[6] Graybill F. A. Introduction to Matrices with Applications in Statistics. Wadsworth, Belmont, CA, 1969.
[7] Henderson H. V., Searle S. R. Vec and vech operators for matrices, with some uses in Jacobians and multivariate statistics. Canadian Journal of Statistics, 7: 65–81, 1979.
[8] Magnus J. R. Linear Structures. Charles Griffin & Co., London, 1988.
[9] Magnus J. R., Neudecker H. Matrix Differential Calculus with Applications in Statistics and Econometrics. Wiley, New York, 1988.
[10] Neuenschwander B. E. Common Principal Components for Dependent Random Vectors. Unpublished PhD thesis, University of Bern (Switzerland), Dept. of Statistics, 1991.
[11] Neuenschwander B. E. A common principal component model for dependent random vectors, 1994. Submitted for publication.
[12] Neuenschwander B. E., Flury B. Common canonical variates, 1994. Submitted for publication.
[13] Searle S. R. Matrix Algebra Useful for Statistics. Wiley, New York, 1982.

Copyright information

© Birkhäuser 1994

Authors and Affiliations

  • Bernard D. Flury (1)
  • Beat E. Neuenschwander (2)
  1. Department of Mathematics, Indiana University, Bloomington, USA
  2. Institut für Sozial- und Präventivmedizin, Universität Bern, Bern, Switzerland
