Volume 60, Issue 1, pp 47–75

A parametric procedure for ultrametric tree estimation from conditional rank order proximity data

  • Martin R. Young
  • Wayne S. DeSarbo


The psychometric and classification literatures have shown that a wide class of discrete or network models (e.g., hierarchical or ultrametric trees) for the analysis of ordinal proximity data is plagued by potentially degenerate solutions when estimated with traditional nonmetric procedures (i.e., procedures that optimize a STRESS-based criterion of fit and whose solutions are invariant under a monotone transformation of the input data). This paper proposes a new parametric, maximum likelihood based procedure for estimating ultrametric trees from conditional rank order proximity data. We present the technical aspects of the model and the estimation algorithm. Some preliminary Monte Carlo results are discussed. A consumer psychology application examines the similarity of fifteen types of snack/breakfast items. Finally, some directions for future research are provided.
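To make the two central terms concrete, the following minimal Python sketch checks the standard ultrametric inequality, d(i, j) ≤ max(d(i, k), d(k, j)), on a small distance matrix and derives row-conditional rank orders from it. It does not reproduce the paper's model or estimation algorithm; the function names and the toy matrices are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def is_ultrametric(d, tol=1e-9):
    """Return True if every triple (i, j, k) of distinct items satisfies
    d[i][j] <= max(d[i][k], d[k][j]), the ultrametric inequality."""
    n = len(d)
    for i, j, k in permutations(range(n), 3):
        if d[i][j] > max(d[i][k], d[k][j]) + tol:
            return False
    return True


def conditional_rank_order(d, row):
    """Order the other items from most to least similar to `row`.
    Ranks are conditional on the row item; ties keep index order."""
    others = [j for j in range(len(d)) if j != row]
    return sorted(others, key=lambda j: d[row][j])


# Hypothetical 4-item tree: items 0 and 1 join at height 1,
# items 2 and 3 join at height 2, and the two clusters join at height 5.
tree_distances = [
    [0, 1, 5, 5],
    [1, 0, 5, 5],
    [5, 5, 0, 2],
    [5, 5, 2, 0],
]

# Hypothetical 3-item matrix that violates the inequality (5 > max(1, 3)).
non_tree = [
    [0, 1, 5],
    [1, 0, 3],
    [5, 3, 0],
]

print(is_ultrametric(tree_distances))
print(is_ultrametric(non_tree))
print(conditional_rank_order(tree_distances, 2))
```

In an ultrametric matrix every row contains ties (items 0 and 1 are equally far from item 2 here), so observed strict rank orders can only be noisy reflections of the tree distances.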

Key words

hierarchical clustering, proximity data, conditional rank orders, maximum likelihood estimation, consumer psychology





Copyright information

© The Psychometric Society 1995

Authors and Affiliations

  • Martin R. Young (1)
  • Wayne S. DeSarbo (2)

  1. Statistics Department, School of Business Administration, University of Michigan, Ann Arbor
  2. Marketing and Statistics Departments, School of Business Administration, University of Michigan, USA
