Abstract
The use of Candecomp to fit scalar products in the context of INDSCAL is based on the assumption that the symmetry of the data matrices involved causes the component matrices to be equal when Candecomp converges. Ten Berge and Kiers gave examples where this assumption is violated for Gramian data matrices. These examples were conjectured to be local minima. It is now shown that, in the single-component case, the assumption can only be violated at saddle points. Chances of Candecomp converging to a saddle point are small but still nonzero.
1 Introduction
Carroll and Chang (1970) developed Candecomp as a method of fitting scalar products derived from INDSCAL. Specifically, they sought to minimize the function
\[
g(X, D) = \sum_{i=1}^{m} \left\| S_i - X D_i X' \right\|^2, \tag{1}
\]
where S_i is a symmetric p × p matrix of (true or pseudo) scalar products, X is a p × r matrix of components, and D_i is a diagonal r × r matrix of saliencies, holding the elements of row i of D in its diagonal, i = 1, …, m. Because direct minimization of g seems difficult, Carroll and Chang instead proposed Candecomp, to minimize
\[
f(X, Y, D) = \sum_{i=1}^{m} \left\| S_i - X D_i Y' \right\|^2. \tag{2}
\]
They assumed that, at the minimum, the symmetry of the slices S_i will cause X and Y to be equal, or at least column-wise proportional. In the latter case, the columns of Y can be rescaled to equal those of X, the inverse scaling being applied to the columns of D. The Candecomp algorithm minimizes f by alternately optimizing X conditionally for fixed Y and D, optimizing Y conditionally for fixed X and D, and optimizing D conditionally for fixed X and Y. In practice, the claim that symmetry of the slices will render X and Y column-wise proportional at the minimum seems warranted. However, for contrived data, counterexamples do exist. In particular, the single-component case (r = 1) has played a role in contriving counterexamples.
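In outline, the alternating scheme just described can be sketched as follows. This is a minimal illustration, not code from the paper: the function name and setup are ours, and each step applies the closed-form conditional least squares update for X, Y, or the diagonals of the D_i.

```python
import numpy as np

def candecomp_sym(S, r, iters=200, seed=0):
    """Alternating least squares for f = sum_i ||S_i - X D_i Y'||^2,
    applied to a list S of symmetric p x p slices (a sketch)."""
    rng = np.random.default_rng(seed)
    m, p = len(S), S[0].shape[0]
    X = rng.standard_normal((p, r))
    Y = rng.standard_normal((p, r))
    D = rng.standard_normal((m, r))            # row i holds diag(D_i)
    for _ in range(iters):
        # X-update: X = (sum S_i Y D_i)(sum D_i Y'Y D_i)^{-1}
        num = sum(S[i] @ Y * D[i] for i in range(m))        # Y * D[i] scales columns
        den = sum((Y * D[i]).T @ (Y * D[i]) for i in range(m))
        X = np.linalg.solve(den.T, num.T).T
        # Y-update, by the same normal equations with roles reversed
        num = sum(S[i] @ X * D[i] for i in range(m))
        den = sum((X * D[i]).T @ (X * D[i]) for i in range(m))
        Y = np.linalg.solve(den.T, num.T).T
        # D-update: diag(D_i) = (X'X * Y'Y)^{-1} diag(X'S_iY), * = Hadamard
        G = (X.T @ X) * (Y.T @ Y)
        for i in range(m):
            D[i] = np.linalg.solve(G, np.diag(X.T @ S[i] @ Y))
    return X, Y, D
```

Each update solves its conditional least squares problem exactly, so f never increases across iterations.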
Ten Berge and Kiers (1991) examined the single-component case, where X, Y, and D can be represented as vectors x, y, d. They constructed Gramian matrices S_1 and S_2, and a solution {x, y, d} for which the derivatives of f vanish, yet the claim that x and y will be proportional is unwarranted. They also showed that such cases could not possibly represent a global minimum for f, and conjectured that they could arise only at local minima. However, they seem to have ignored saddle points as another possibility. In the present note it is shown that, when S_1, …, S_m are Gramian, nonproportional components in the single-component case can be obtained only at saddle points. This is more than merely a theoretical exercise: Because saddle points seem far more difficult to attain by Candecomp than local minima, the practical relevance of counterexamples where x and y are nonproportional is thus further reduced. We start by examining derivatives. Throughout, we use Σ to denote \(\sum_{i=1}^{m}\).
2 First- and Second-Order Derivatives of f
The conditionally optimal solutions for X, Y, and D are those that satisfy the equations
\[
X = \Big(\sum S_i Y D_i\Big)\Big(\sum D_i Y'Y D_i\Big)^{-1}, \tag{3}
\]
\[
Y = \Big(\sum S_i X D_i\Big)\Big(\sum D_i X'X D_i\Big)^{-1}, \tag{4}
\]
and
\[
\mathrm{Vec}(D_i) = (X'X * Y'Y)^{-1}\,\mathrm{Diag}(X'S_iY), \quad i = 1, \ldots, m, \tag{5}
\]
where * is the element-wise (Hadamard) product of matrices, Vec(D_i) is the vector holding the diagonal elements of D_i, and Diag(X'S_iY) is the vector holding the diagonal elements of X'S_iY. In the r = 1 case, these equations simplify to
\[
x = \frac{\sum d_i S_i y}{(y'y)\sum d_i^2}, \tag{6}
\]
\[
y = \frac{\sum d_i S_i x}{(x'x)\sum d_i^2}, \tag{7}
\]
and
\[
d_i = \frac{x'S_i y}{(x'x)(y'y)}, \quad i = 1, \ldots, m, \tag{8}
\]
where d_i is the only element of D_i. The second-order derivative matrix for the r = 1 case is
\[
\frac{\partial^2 f}{\partial(x, y, d)\,\partial(x, y, d)'} =
2\begin{bmatrix}
\big(\sum d_i^2\big)(y'y)\,I_p & -\sum d_i S_i + 2\big(\sum d_i^2\big)xy' & B \\[2pt]
-\sum d_i S_i + 2\big(\sum d_i^2\big)yx' & \big(\sum d_i^2\big)(x'x)\,I_p & C \\[2pt]
B' & C' & (x'x)(y'y)\,I_m
\end{bmatrix}, \tag{9}
\]
where d is the vector holding d_1, …, d_m, column i of the p × m matrix B is −S_i y + 2d_i(y'y)x, and column i of the p × m matrix C is −S_i x + 2d_i(x'x)y. When this matrix is positive definite (i.e., has all eigenvalues positive), we have a local or global minimum. When it is indefinite (i.e., has both positive and negative eigenvalues), we have a saddle point. Consider the contrived data and solution of Ten Berge and Kiers (1991), with x and y nonproportional.
It can be verified that the first-order derivatives vanish at this solution. Ten Berge and Kiers (1991, p. 324) conjectured from their Result 6 that such cases with x and y nonproportional occur only at local minima of f. However, the matrix of second-order derivatives (9) has eigenvalues 13.04, 10, 8, 6, 2, 0, 0, and −11.04, so it is indefinite. Therefore, we are looking at a saddle point. This is not just a property of this particular example: all stationary points of f with x and y nonproportional are saddle points. To prove this, we might try to show that (9) is indefinite whenever (6), (7), and (8) are satisfied for a nonproportional pair x and y. It is, however, much easier to use an alternative approach to minimizing f.
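This second-order check can also be carried out numerically. The sketch below uses a contrived Gramian pair constructed for this note (it is not the data of Ten Berge and Kiers): at the nonproportional stationary point shown, a central-difference Hessian of f has eigenvalues of both signs, so the point is a saddle point.

```python
import numpy as np

# Contrived Gramian slices, built for this note: S_i = v_i v_i' is
# positive semidefinite by construction.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([1.0, 0.0, -1.0])
S = [np.outer(v1, v1), np.outer(v2, v2)]

# A stationary point of f with x and y nonproportional: the updates
# (6), (7), (8) reproduce it, yet x and y are orthogonal.
x0 = np.array([1.0, 0.0, 0.0])
y0 = np.array([0.0, 0.0, 1.0])
d0 = np.array([1.0, -1.0])
theta0 = np.concatenate([x0, y0, d0])

def f(theta):
    """f(x, y, d) = sum_i ||S_i - d_i x y'||^2 for the r = 1 case."""
    x, y, d = theta[:3], theta[3:6], theta[6:]
    return sum(np.linalg.norm(S[i] - d[i] * np.outer(x, y)) ** 2 for i in range(2))

# Central-difference gradient and Hessian of f at theta0.
eps, n = 1e-4, theta0.size
E = np.eye(n)
grad = np.array([(f(theta0 + eps * E[j]) - f(theta0 - eps * E[j])) / (2 * eps)
                 for j in range(n)])
H = np.array([[(f(theta0 + eps * E[j] + eps * E[k]) - f(theta0 + eps * E[j] - eps * E[k])
                - f(theta0 - eps * E[j] + eps * E[k]) + f(theta0 - eps * E[j] - eps * E[k]))
               / (4 * eps ** 2) for k in range(n)] for j in range(n)])
eigs = np.linalg.eigvalsh((H + H.T) / 2)
# grad vanishes, while eigs contains both positive and negative values
```

The vanishing gradient confirms stationarity, and the mixed signs of the Hessian eigenvalues confirm that this nonproportional solution is a saddle point of f.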
3 An Alternative Approach for the r = 1 Case
Instead of using Candecomp to minimize f, the r = 1 case can be solved by constraining x and y to be of unit length and expressing d_i as x'S_i y, i = 1, …, m; see (8). Minimizing f then reduces to maximizing
\[
h(x, y) = \sum (x'S_i y)^2 \tag{11}
\]
subject to the constraints x'x = y'y = 1. Clearly, for y fixed, x can be optimized conditionally as the eigenvector associated with the largest eigenvalue of ΣS_i yy'S_i and, for x fixed, the conditionally optimal y is the eigenvector associated with the largest eigenvalue of ΣS_i xx'S_i. Because d is updated implicitly, this algorithm optimizes {x, d} and {y, d} alternately. The first-order derivatives of the Lagrangian \( L\left( {x,y} \right) = \sum {\left( {x'S_i y} \right)^2 - \lambda _1 \left( {x'x - 1} \right) - } \lambda _2 \left( {y'y - 1} \right) \) are
\[
\frac{\partial L}{\partial x} = 2\sum (x'S_i y)\,S_i y - 2\lambda_1 x \tag{12}
\]
and
\[
\frac{\partial L}{\partial y} = 2\sum (x'S_i y)\,S_i x - 2\lambda_2 y, \tag{13}
\]
and the matrix of second-order derivatives is
\[
\frac{\partial^2 L}{\partial(x, y)\,\partial(x, y)'} =
2\begin{bmatrix}
\sum S_i yy'S_i - \lambda_1 I_p & \sum \big[(x'S_i y)\,S_i + S_i yx'S_i\big] \\[2pt]
\sum \big[(x'S_i y)\,S_i + S_i xy'S_i\big] & \sum S_i xx'S_i - \lambda_2 I_p
\end{bmatrix}. \tag{14}
\]
It is immediate from (12) and (13) that λ_1 = λ_2 = Σ(x'S_i y)^2 when the first-order derivatives vanish. Let the columns of the 2p × (2p − 2) matrix W span the subspace orthogonal to the columns of \(\begin{bmatrix} 2x & 0 \\ 0 & 2y \end{bmatrix}\), the matrix of constraint gradients. When the first-order derivatives of L vanish, we have a local maximum if \( W'\frac{\partial^2 L}{\partial(x, y)\,\partial(x, y)'}W \) is negative definite, and a saddle point if it is indefinite. For the data of Ten Berge and Kiers (1991), that matrix has eigenvalues 72, −2, −6, and −8. It is indefinite, implying a saddle point, as before. It will now be shown that all cases where x and y are nonproportional are saddle points of h.
Result. When a solution for h(x, y) is a local maximum, it has conditional optimality. Such a solution has x = ±y when a subset of S_1, …, S_m is Gramian and admits a nonsingular linear combination.
Proof: Suppose the solution (x, y) is a local maximum of h. Then x is a local maximum of h_y(z) = z'(ΣS_i yy'S_i)z subject to z'z = 1, with y fixed. Define A_y = ΣS_i yy'S_i, with eigendecomposition A_y = KΛK', where Λ is the diagonal matrix of eigenvalues λ_1 ≥ λ_2 ≥ … ≥ λ_p. Define u = K'z. Then u is a local maximum of u'Λu subject to u'u = 1. Because the first-order derivatives vanish, we have
\[
\Lambda u = \lambda u
\]
for some eigenvalue λ and associated eigenvector u of Λ. The matrix of second derivatives is 2(Λ − λI). At a local maximum, it is at least negative semidefinite on the subspace orthogonal to u; that is, for all h orthogonal to u we have h'(Λ − λI)h ≤ 0. When we pick the largest eigenvalue λ_1, with associated eigenvector e_1, the first column of the identity matrix, this inequality is satisfied. However, when we pick any eigenvalue λ < λ_1, the vector e_1 is orthogonal to the associated eigenvector, and e_1'(Λ − λI)e_1 = λ_1 − λ is strictly positive. It follows that a local maximum has u = e_1, hence x equal to the principal eigenvector of A_y. That is, x is conditionally optimal for h given y. The entire derivation can be repeated with the roles of x and y reversed, showing that we have conditional optimality also for y at local maxima of h. Finally, Result 6 of Ten Berge and Kiers (1991) implies that every solution with conditional optimality for x and y has (x'S_i x)^2 = (y'S_i y)^2. When a subset of S_1, …, S_m is Gramian and admits a nonsingular linear combination, it follows that x = ±y.
The meaning of this result is that, at stationary points of h in the case of Gramian matrices, x and y can be nonproportional only at saddle points. This also extends to the more general function f: when that function has a local minimum, h has a local maximum, which means that x = ±y is guaranteed. As a result, a comment in Ten Berge and Kiers (1991, p. 324) can now be sharpened: where it was stated that asymmetry in the present examples can occur only at local minima, it is now clear that asymmetry can occur only at saddle points. This fact reduces the probability of ever finding asymmetry in practical applications. Still, it does happen now and then that Candecomp converges to a saddle point. For instance, when the data are the S_1 and S_2 of Ten Berge and Kiers (1991), there is a saddle point at a nonproportional solution with components x_0 and y_0. When Candecomp is started at points where x_0 is replaced by x_0 + t × rand(−0.5, +0.5), updating y and d first, and t is picked as small as 0.0001, it converges to the saddle point in 7% of the cases.
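The fact that Candecomp can sit at such a point can be illustrated with a contrived Gramian pair constructed for this note (again, not the data of Ten Berge and Kiers): a nonproportional stationary point is an exact fixed point of the r = 1 updates (6), (7), (8), so the algorithm started there never leaves it.

```python
import numpy as np

# Contrived Gramian slices built for this note: S_i = v_i v_i'.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([1.0, 0.0, -1.0])
S = [np.outer(v1, v1), np.outer(v2, v2)]

# A nonproportional stationary point of f for these slices.
x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 0.0, 1.0])
d = np.array([1.0, -1.0])

for _ in range(10):
    # r = 1 Candecomp updates, taking y and d first as in the text.
    y = sum(d[i] * S[i] @ x for i in range(2)) / ((x @ x) * (d @ d))      # (7)
    d = np.array([x @ S[i] @ y for i in range(2)]) / ((x @ x) * (y @ y))  # (8)
    x = sum(d[i] * S[i] @ y for i in range(2)) / ((y @ y) * (d @ d))      # (6)
# x, y, d are unchanged: the saddle point is a fixed point of Candecomp
```

A start exactly at a stationary point is of course a fixed point of any alternating least squares scheme; the 7% figure concerns starts perturbed slightly away from it.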
4 Discussion
The assumption that Candecomp, when applied to symmetric slices, will converge to solutions with the two component matrices equal up to column scaling remains unsettled for r > 1. However, the counterexamples presented by Ten Berge and Kiers are saddle points rather than local minima. This is reassuring, because Candecomp, for r = 1, gets stuck at local minima much more easily than at saddle points. Still, the probability of convergence to a saddle point, when Candecomp starts very close to one, is strictly positive for r = 1.
References
Carroll, J.D., & Chang, J.J. (1970). Analysis of individual differences in multidimensional scaling via an n-way generalization of Eckart-Young decomposition. Psychometrika, 35, 283–319.
Ten Berge, J.M.F., & Kiers, H.A.L. (1991). Some clarifications of the Candecomp algorithm applied to Indscal. Psychometrika, 56, 317–326.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Dosse, M.B., & Ten Berge, J.M.F. The Assumption of Proportional Components when Candecomp is Applied to Symmetric Matrices in the Context of Indscal. Psychometrika 73, 303–307 (2008). https://doi.org/10.1007/s11336-007-9044-x