Journal of Classification, Volume 3, Issue 1, pp 5–48

Metric and Euclidean properties of dissimilarity coefficients

  • J. C. Gower
  • P. Legendre

Abstract

We assemble here properties of certain dissimilarity coefficients and are specially concerned with their metric and Euclidean status. No attempt is made to be exhaustive as far as coefficients are concerned, but certain mathematical results that we have found useful are presented and should help establish similar properties for other coefficients. The response to different types of data is investigated, leading to guidance on the choice of an appropriate coefficient.
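
As a rough illustration of the two properties the abstract is concerned with, the following Python sketch checks a dissimilarity matrix for the triangle inequality (metric property) and for Euclidean embeddability. It is not taken from the paper: the function names `is_metric` and `is_euclidean`, the tolerances, and the example matrices are hypothetical, and NumPy is assumed to be available. The Euclidean test uses the standard criterion that the doubly centred matrix of -d_ij^2 / 2 has no negative eigenvalues.

```python
import numpy as np


def is_metric(D, tol=1e-12):
    """Return True if d_ij <= d_ik + d_kj holds for every triple (i, k, j)."""
    D = np.asarray(D, dtype=float)
    # via[i, k, j] = d_ik + d_kj, built by broadcasting
    via = D[:, :, None] + D[None, :, :]
    return bool(np.all(D[:, None, :] <= via + tol))


def is_euclidean(D, tol=1e-10):
    """Return True if points with pairwise distances D fit in a Euclidean space.

    Criterion: B = -1/2 * J * (D squared element-wise) * J is positive
    semi-definite, where J = I - 11'/n is the centring matrix.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    return bool(np.linalg.eigvalsh(B).min() >= -tol)


# Hypothetical example 1: path lengths on a star graph
# (one centre, three leaves at distance 1 from the centre).
star = np.array([
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 2.0, 2.0],
    [1.0, 2.0, 0.0, 2.0],
    [1.0, 2.0, 2.0, 0.0],
])

# Hypothetical example 2: violates the triangle inequality (3 > 1 + 1).
bad = np.array([
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 1.0],
    [3.0, 1.0, 0.0],
])

print(is_metric(star), is_euclidean(star))  # True False
print(is_euclidean(np.sqrt(star)))          # True
print(is_metric(bad))                       # False
```

In this toy case the star-shaped matrix is metric but not Euclidean, while its element-wise square root is Euclidean. That is a property of the example only, not a general statement about any particular coefficient.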

Keywords

Choice of coefficient; Dissimilarity; Distance; Euclidean property; Metric property; Similarity

Résumé (English translation)

This work presents some properties of certain resemblance coefficients, in particular their ability to produce metric and Euclidean distance matrices. Without claiming to be exhaustive in this review of coefficients, we present certain mathematical results that we believe to be of interest and that could be established for other coefficients. Finally, we analyse the response of resemblance measures to different types of data, which allows us to formulate recommendations on the choice of a coefficient.



Copyright information

© Springer-Verlag New York Inc. 1986

Authors and Affiliations

  • J. C. Gower (1)
  • P. Legendre (2)

  1. Statistics Department, Rothamsted Experimental Station, Harpenden, United Kingdom
  2. Département de Sciences Biologiques, Université de Montréal, Montréal, Canada
