The Methods and Problems of Cluster Analysis

  • Roger K. Blashfield
  • Mark S. Aldenderfer

Abstract

Cluster analysis methods have a long history. The earliest known procedures were suggested by anthropologists (Czekanowski, 1911; Driver and Kroeber, 1932). Later, these ideas were picked up in psychology. For instance, Zubin (1938) proposed a rather simple method for sorting a correlation matrix which would yield clusters. About the same time, Stephenson (1936) suggested the use of inverted factor analysis to find clusters of people.

References

  1. Anderberg, M. Cluster analysis for applications. New York: Academic Press, 1973.
  2. Anderson, T. W., Das Gupta, S., and Styan, G. P. H. A bibliography of multivariate statistical analysis. Edinburgh: Oliver & Boyd, 1972.
  3. Arabie, P., and Carroll, J. D. MAPCLUS: A mathematical programming approach to fitting the ADCLUS model. Psychometrika, 1980, 45, 211–235.
  4. Bailey, T. A., and Dubes, R. Cluster validity profiles. Pattern Recognition, 1982, 15, 61–83.
  5. Bartko, J. J. On various intraclass correlation reliability coefficients. Psychological Bulletin, 1976, 83, 762–765.
  6. Bartko, J. J., Strauss, J. S., and Carpenter, W. S. An evaluation of taxometric techniques for psychiatric data. Classification Society Bulletin, 1971, 2, 2–28.
  7. Bass, B. M. Iterative inverse factor analysis: A rapid method for clustering persons. Psychometrika, 1957, 22, 105–107.
  8. Bayne, R., Beauchamp, J., Begovich, C., and Kane, V. Monte Carlo comparisons of selected clustering procedures. Pattern Recognition, 1980, 12, 51–62.
  9. Bellman, R. E., Kalaba, R., and Zadeh, L. Abstraction and pattern classification. Journal of Mathematical Analysis and Applications, 1966, 13, 1–7.
  10. Bezdek, J. C. Cluster validity with fuzzy sets. Journal of Cybernetics, 1974, 3, 57–73.
  11. Bezdek, J. C. Mathematical models for taxonomy. Proceedings of the 8th International Conference on Numerical Taxonomy, 1975.
  12. Boake, C. Recovery of simulated MMPI mixtures by seven methods of cluster analysis. Paper presented to annual meeting of American Psychological Association, Anaheim, Calif., 1983.
  13. Bonacich, P., and Domhoff, G. W. Latent classes and group membership. Social Networks, 1981, 3, 175–196.
  14. Borgen, F. H., and Weiss, D. J. Cluster analysis and counseling research. Journal of Counseling Psychology, 1971, 18, 583–591.
  15. Burt, C. Correlations between persons. British Journal of Psychology, 1937, 28, 167–185.
  16. Carlson, K. A. Classes of adult offenders: A multivariate study. Journal of Abnormal Psychology, 1972, 79, 84–93.
  17. Carmichael, J. W., George, J. A., and Julius, R. S. Finding natural clusters. Systematic Zoology, 1968, 17, 144–150.
  18. Carroll, J. D., and Arabie, P. INDCLUS: An individual differences generalization of the ADCLUS model and the MAPCLUS algorithm. Psychometrika, 1983, 48, 157–169.
  19. Carroll, R. M., and Field, J. A comparison of the classification accuracy of profile similarity measures. Multivariate Behavioral Research, 1974, 9, 373–380.
  20. Cattell, R. B. A note on correlation clusters and cluster search methods. Psychometrika, 1944, 9, 169–184.
  21. Cattell, R. B. rp and other coefficients of pattern similarity. Psychometrika, 1949, 14, 279–298.
  22. Cattell, R. B. Factor analysis. New York: Harper, 1952.
  23. Cattell, R. B. A universal index of psychological factors. Psychologia, 1957, 1, 74–85.
  24. Cattell, R. B. The scientific analysis of personality. Chicago: Aldine, 1965.
  25. Cattell, R. B. The scientific use of factor analysis. New York: Academic Press, 1978.
  26. Cattell, R. B., Coulter, M. A., and Tsujioka, B. The taxonomic recognition of types and functional emergents. In R. B. Cattell (Ed.), Handbook of Multivariate Experimental Psychology. Chicago: Rand McNally, 1966.
  27. Clifford, H., and Stephenson, W. An introduction to numerical taxonomy. New York: Academic Press, 1975.
  28. Cohen, J. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 1960, 20, 37–46.
  29. Cole, A. J. Numerical taxonomy. New York: Academic Press, 1969.
  30. Cormack, R. M. A review of classification. Journal of the Royal Statistical Society (Series A), 134, 321–367.
  31. Cornell, J. A. Experiments with mixtures. Designs, models, and the analysis of mixture data. New York: Wiley, 1981.
  32. Cronbach, L. J., and Gleser, G. C. Assessing similarity between profiles. Psychological Bulletin, 1953, 50, 456–473.
  33. Czekanowski, J. Objektive Kriterien in der Ethnologie. Korrespondenzblatt der Deutschen Gesellschaft für Anthropologie, Ethnologie, und Urgeschichte, 1911, 47, 1–5.
  34. Davies, D. L., and Bouldin, D. W. A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1979, PAMI-1, 224–227.
  35. Delattre, M., and Hansen, P. Bicriterion cluster analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1980, 2, 277–291.
  36. Driver, H. E., and Kroeber, A. L. Quantitative expression of cultural relationships. University of California Publications in Archaeology and Ethnology, 1932, 31, 211–216.
  37. Dubes, R., and Jain, A. K. Clustering methodologies in exploratory data analysis. Advances in Computers, 1980, 19, 113–228.
  38. Duda, R. O., and Hart, P. E. Pattern classification and scene analysis. New York: Wiley, 1973.
  39. Duran, B. S., and Odell, P. L. Cluster analysis: A survey. Berlin: Springer-Verlag, 1974.
  40. Edelbrock, C., and McLaughlin, B. Hierarchical cluster analysis using intraclass correlations: A mixture model study. Multivariate Behavioral Research, 1980, 15, 299–318.
  41. Edelbrock, C., and Reed, M. Inverse factor analysis: An evaluation using benchmark data sets. Unpublished, 1982.
  42. Edwards, A. W. F., and Cavalli-Sforza, L. L. A method for cluster analysis. Biometrics, 1965, 21, 362–375.
  43. Everitt, B. S. Cluster analysis. London: Heinemann, 1974.
  44. Everitt, B. S. Unresolved problems in cluster analysis. Biometrics, 1979, 35, 169–181.
  45. Everitt, B. S. Cluster analysis (2nd ed.). New York: Wiley, 1980.
  46. Everitt, B. S., Gourlay, A. J., and Kendell, R. E. An attempt at validation of traditional psychiatric syndromes by cluster analysis. British Journal of Psychiatry, 1971, 119, 399–412.
  47. Fisher, L., and van Ness, J. W. Admissible clustering procedures. Biometrika, 1971, 58, 91–104.
  48. Fisher, R. A. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 1936, 179–188.
  49. Fisher, W. D. Clustering and aggregation in economics. Baltimore: Johns Hopkins University Press, 1968.
  50. Fleiss, J., Lawlor, W., Platman, S., and Fieve, R. On the use of inverted factor analysis for generating typologies. Journal of Abnormal Psychology, 1971, 77, 127–132.
  51. Florek, K., Lukaszewicz, J., Perkal, J., Steinhaus, H., and Zubrzycki, S. Sur la liaison et la division des points d’un ensemble fini. Colloquium Mathematicum, 1951, 2, 282–285.
  52. Friedman, H. P., and Rubin, J. On some criteria for grouping data. Journal of the American Statistical Association, 1967, 62, 1159–1178.
  53. Golden, R. R., and Meehl, P. E. Detection of the schizoid taxon with MMPI indicators. Journal of Abnormal Psychology, 1979, 88, 217–233.
  54. Golden, R. R., and Meehl, P. E. Detection of biological sex: An empirical test of cluster methods. Multivariate Behavioral Research, 1981, 16, 475–496.
  55. Goldstein, S. G., and Linden, J. D. A comparison of multivariate grouping techniques commonly used with profile data. Multivariate Behavioral Research, 1969, 4, 103–114.
  56. Gordon, A. D., and Henderson, J. T. An algorithm for Euclidean sum of squares classification. Biometrics, 1977, 33, 355–362.
  57. Gowda, K. C., and Krishna, G. Agglomerative clustering using the concept of mutual nearest neighborhood. Pattern Recognition, 1978, 10, 105–112.
  58. Gower, J. C. A comparison of some methods of cluster analysis. Biometrics, 1967, 23, 623–637.
  59. Gower, J. C. Goodness-of-fit criteria for classification and other patterned structures. Proceedings of the 8th International Conference on Numerical Taxonomy, 1975, pp. 38–62.
  60. Gower, J. C., and Ross, G. J. S. Minimum spanning trees and single-linkage cluster analysis. Applied Statistics, 1969, 18, 54–64.
  61. Gridgeman, N. T. A comparison of two methods of analysis of mixtures of normal distributions. Technometrics, 1970, 12, 823–833.
  62. Gross, A. L. A Monte Carlo study of the accuracy of the hierarchical grouping procedure. Multivariate Behavioral Research, 1972, 7, 379–389.
  63. Guertin, W. H. The search for recurring patterns among individual profiles. Educational and Psychological Measurement, 1966, 26, 151–165.
  64. Hamer, R., and Cunningham, J. Cluster analyzing profile data confounded with interrater differences: Comparison of profile association measures. Applied Psychological Measurement, 1981, 5, 63–72.
  65. Hansen, P., and Delattre, M. Complete-link cluster analysis by graph coloring. Journal of the American Statistical Association, 1978, 73, 397–403.
  66. Hartigan, J. A. Representation of similarity matrices by trees. Journal of the American Statistical Association, 1967, 62, 1140–1158.
  67. Hartigan, J. A. Clustering algorithms. New York: Wiley, 1975.
  68. Hartigan, J. A. Distributional problems in clustering. In J. Van Ryzin (Ed.), Classification and clustering. New York: Academic Press, 1977.
  69. Hartigan, J. A. Consistency of single linkage for high-density clusters. Journal of the American Statistical Association, 1981, 76, 388–394.
  70. Helmstadter, G. C. An empirical comparison of methods for estimating profile similarity. Educational and Psychological Measurement, 1957, 17, 71–82.
  71. Holgerson, M. The limited value of the cophenetic correlation as a clustering criterion. Pattern Recognition, 1978, 10, 287–295.
  72. Horst, P. Factor analysis of data matrices. New York: Holt, Rinehart & Winston, 1965.
  73. Hosmer, D. W. A comparison of iterative maximum likelihood estimates of the parameters of a mixture of two normal distributions under three different types of samples. Biometrics, 1973, 29, 761–770.
  74. Hubert, L. J. Some applications of graph theory to clustering. Psychometrika, 1974b, 39, 283–309.
  75. Hudson, H. C. (Ed.) Classifying social data: New applications of analytic methods for social science research. San Francisco: Jossey-Bass, 1982.
  76. Jambu, M., and Lebeaux, M. O. Cluster analysis and data analysis. Amsterdam: North-Holland, 1983.
  77. Jardine, N., and Sibson, R. The construction of hierarchic and nonhierarchic classification. Computer Journal, 1968, 11, 117–184.
  78. Jardine, N., and Sibson, R. Mathematical taxonomy. New York: Wiley, 1971.
  79. Johnson, S. Hierarchical clustering schemes. Psychometrika, 1967, 32, 241–254.
  80. Krus, D. J. Logical basis of dimensionality. Applied Psychological Measurement, 1978, 2, 321–329.
  81. Kuiper, F., and Fisher, L. A Monte Carlo comparison of six clustering procedures. Biometrics, 1975, 31, 777–783.
  82. Lance, G., and Williams, W. A general theory of classificatory sorting strategies. Computer Journal, 1967, 9, 373–380.
  83. Ling, R. F. A probability theory of cluster analysis. Journal of the American Statistical Association, 1973a, 68, 159–169.
  84. Ling, R. F. The expected number of components in random linear graphs. Annals of Probability, 1973b, 1, 876–881.
  85. Lorr, M. (Ed.) Explorations in typing psychotics. Elmsford, N.Y.: Pergamon Press, 1966.
  86. Lorr, M. Cluster analysis for social scientists. San Francisco: Jossey-Bass, 1983.
  87. McNaughton-Smith, P. Some statistical and other numerical techniques for classifying individuals. London: Her Majesty’s Stationery Office, 1965.
  88. McQuitty, L. L. Elementary linkage analysis for isolating orthogonal and oblique types and typal relevancies. Educational and Psychological Measurement, 1957, 17, 207–22.
  89. McQuitty, L. L. Capabilities and improvements in linkage analysis as a clustering method. Educational and Psychological Measurement, 1964, 24, 441–456.
  90. Mahalanobis, P. C. On the generalized distance in statistics. Proceedings of the National Institute of Sciences of India, 1936, 2, 49–55.
  91. Matthews, A. Standardization of measures prior to clustering. Biometrics, 1980, 35, 892–897.
  92. May, K. O. The growth and quality of the mathematical literature. Isis, 1969, 59, 363–371.
  93. Meehl, P. E. Psychodiagnosis: Selected papers. Minneapolis: University of Minnesota Press, 1973.
  94. Meehl, P. E. A funny thing happened to us on the way to latent entities. Journal of Personality Assessment, 1979, 43, 563–581.
  95. Meehl, P. E., and Golden, R. R. Taxometric methods. In P. C. Kendall and J. N. Butcher (Eds.), Handbook of research methods in clinical psychology. New York: Wiley, 1982.
  96. Mezzich, J. E. Evaluating clustering methods for psychiatric diagnosis. Biological Psychiatry, 1978, 13, 265–281.
  97. Mezzich, J. E. Comparing cluster analysis methods. In H. C. Hudson (Ed.), Classifying social data: New applications of analytic methods for social science research. San Francisco: Jossey-Bass, 1982.
  98. Mezzich, J. E., and Solomon, H. Taxonomy and behavioral science. New York: Academic Press, 1980.
  99. Milligan, G. W. An examination of the effect of six types of error perturbation on fifteen clustering algorithms. Psychometrika, 1980, 45, 325–342.
  100. Milligan, G. W. A review of Monte Carlo tests of cluster analysis. Multivariate Behavioral Research, 1981, 16, 379–407.
  101. Milligan, G. W., and Cooper, M. C. An examination of procedures for determining the number of clusters in a data set. Unpublished report, 1983.
  102. Mojena, R. Hierarchical grouping methods and stopping rules: An evaluation. Computer Journal, 1977, 20, 359–363.
  103. Nowakowska, M. Epidemical spread of scientific objects: An empirical approach to some problems of meta-science. Theory and Decision, 1973, 3, 262–297.
  104. Nunnally, J. C. The analysis of profile data. Psychological Bulletin, 1962, 59, 311–319.
  105. Overall, J., and Klett, C. Applied multivariate analysis. New York: McGraw-Hill, 1972.
  106. Paykel, E. S. Classification of depressed patients: A cluster analysis derived grouping. British Journal of Psychiatry, 1971, 118, 275–288.
  107. Rand, W. M. Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 1971, 66, 846–850.
  108. Rohlf, F. J. Single-link clustering algorithms. In P. R. Krishnaiah and L. N. Kanal (Eds.), Handbook of Statistics. Vol. 2. Amsterdam: North-Holland, 1982.
  109. Rucci, A. J., and Tweney, R. D. Analysis of variance and the “second discipline” of scientific psychology: A historical account. Psychological Bulletin, 1980, 87, 166–184.
  110. Scoltock, J. A survey of the literature of cluster analysis. Computer Journal, 1982, 25, 130–134.
  111. Sclove, S. Population mixture models and clustering algorithms. Communications in Statistics: Theory and Methods, 1977, A6, 417–434.
  112. Shepard, R. N., and Arabie, P. Additive clustering: Representation of similarities as combination of discrete overlapping properties. Psychological Review, 1979, 86, 87–123.
  113. Skinner, H. A. Differentiating the contribution of elevation, scatter, and shape in profile similarity. Educational and Psychological Measurement, 1978, 38, 297–308.
  114. Skinner, H. A. Dimensions and clusters: A hybrid approach to classification. Applied Psychological Measurement, 1979, 3, 327–341.
  115. Skinner, H. A., and Lei, H. Modal profile analysis: A computer program for classification research. Educational and Psychological Measurement, 1980, 40, 769–772.
  116. Skinner, H. A., Jackson, D. N., and Hoffman, H. Alcoholic personality types: Identification and correlates. Journal of Abnormal Psychology, 1974, 83, 658–666.
  117. Sneath, P. H. A. The application of computers to taxonomy. Journal of General Microbiology, 1957, 17, 201–226.
  118. Sneath, P. H. A. A method for testing the distinctiveness of clusters: A test for the disjunction of two clusters in Euclidean space as measured by their overlap. Mathematical Geology, 1977, 9, 123–143.
  119. Sneath, P., and Sokal, R. Numerical taxonomy. San Francisco: Freeman, 1973.
  120. Sokal, R., and Michener, C. D. A statistical method for evaluating systematic relationships. University of Kansas Scientific Bulletin, 1958, 38, 1409–1438.
  121. Sokal, R. R., and Rohlf, F. J. The comparison of dendrograms by objective methods. Taxon, 1962, 11, 33–40.
  122. Sokal, R. R., and Rohlf, F. J. An experiment in taxonomic judgement. Systematic Botany, 1980, 5, 341–365.
  123. Sokal, R. R., and Sneath, P. Principles of numerical taxonomy. San Francisco: Freeman, 1963.
  124. Spath, H. Cluster analysis algorithms. New York: Wiley, 1980.
  125. Stephenson, W. Introduction of inverted factor analysis with some applications to studies in orexia. Journal of Educational Psychology, 1936, 5, 353–367.
  126. Tryon, R. Cluster analysis. New York: McGraw-Hill, 1939.
  127. Tryon, R. C. Identification of social areas by cluster analysis. Berkeley: University of California Press, 1955.
  128. Tryon, R. C., and Bailey, D. E. Cluster analysis. New York: McGraw-Hill, 1970.
  129. Tversky, A. Features of similarity. Psychological Review, 1977, 84, 327–352.
  130. Van Ryzin, J. (Ed.) Classification and clustering. New York: Academic Press, 1977.
  131. Ward, J. H. Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 1963, 58, 236–244.
  132. Williams, W. T., Lance, G. N., Dale, M. B., and Clifford, H. T. Controversy concerning the criteria for taxometric strategies. Computer Journal, 1971, 14, 162–165.
  133. Wishart, D. Mode analysis: A generalization of nearest neighbor which reduces chaining effects. In A. Cole (Ed.), Numerical taxonomy. New York: Academic Press, pp. 282–311.
  134. Wolfe, J. H. Pattern clustering by multivariate mixture analysis. Multivariate Behavioral Research, 1970, 5, 329–350.
  135. Wolfe, J. H. A Monte Carlo study of the sampling distribution of the likelihood ratio for mixtures of multinormal distributions. Naval Personnel and Training Research Laboratory Technical Bulletin STB 72–2, San Diego, Calif., 1971.
  136. Wong, M. A., and Lane, T. A Rth nearest neighbor clustering procedure. Unpublished manuscript, 1981.
  137. Woodbury, M. A., and Clive, J. Clinical pure types as a fuzzy partition. Journal of Cybernetics, 1974, 4, 111–121.
  138. Yahil, A., and Brown, M. B. On separating clusters from background. Technometrics, 1976, 18, 55–58.
  139. Yau, S. S., and Chang, S. C. A direct method for cluster analysis. Pattern Recognition, 1975, 7, 215–224.
  140. Zadeh, L. A. Similarity relations and fuzzy orderings. Information Sciences, 1971, 3, 177–200.
  141. Zahn, C. T. Graph-theoretic methods for detecting and describing Gestalt clusters. IEEE Transactions on Computers, 1971, C-20, 68–86.
  142. Zubin, J. A technique for measuring likemindedness. Journal of Abnormal Psychology, 1938, 33, 508–516.

Copyright information

© Plenum Press, New York 1988

Authors and Affiliations

  • Roger K. Blashfield (1)
  • Mark S. Aldenderfer (2)
  1. Department of Psychiatry, University of Florida, Gainesville, USA
  2. Department of Anthropology, Northwestern University, Evanston, USA