A method for testing the distinctness of clusters: A test of the disjunction of two clusters in Euclidean space as measured by their overlap

Sneath, P. H. A.

doi:10.1007/BF02312508

Published: April 1977

Volume 9, pages 123–143, (1977)

Journal of the International Association for Mathematical Geology

Abstract

A method is described for testing the distinctness of two clusters in Euclidean space. One first calculates the projections, q, of the N₁ and N₂ members of the clusters onto the line joining the cluster centroids. From the distributions of q an index of disjunction, W, is calculated, which corresponds to an index of overlap, V_G. The quantity W√(N₁+N₂) is distributed as noncentral t, subject to assumptions on the multivariate normal distribution of the clusters. This allows a test of whether the observed disjunction is significantly greater than a chosen figure, which is equivalent to testing whether the overlap of the clusters is significantly less than a corresponding value of V_G. Two clusters that appear distinct may be produced simply by the partitioning of a homogeneous swarm into two contiguous regions. Provided that the clusters form a dichotomy in a dendrogram, and that the clustering method yields geometrically convex clusters, a conservative test of this situation can be derived by determining the excess of W over the value expected for a rectangular distribution.
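
The projection step described in the abstract can be illustrated with a minimal sketch. It assumes each cluster is given as a list of coordinate tuples; the function names (`centroid`, `project_onto_centroid_axis`, `scaled_statistic`) and the sample data are hypothetical and do not come from the article. The abstract does not state the formula for W, so the sketch stops at the projections q and their summary statistics, and takes W as a value supplied by the reader when forming W√(N₁+N₂).

```python
import math
from statistics import mean, stdev


def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    return [mean(coords) for coords in zip(*points)]


def project_onto_centroid_axis(cluster1, cluster2):
    """Scalar projections q of both clusters onto the line joining
    their centroids. The origin is placed at centroid 1 (an
    illustrative choice, not taken from the article)."""
    c1, c2 = centroid(cluster1), centroid(cluster2)
    axis = [b - a for a, b in zip(c1, c2)]
    length = math.sqrt(sum(a * a for a in axis))
    if length == 0:
        raise ValueError("centroids coincide; the joining line is undefined")
    unit = [a / length for a in axis]

    def q(point):
        # Dot product of (point - centroid 1) with the unit axis vector.
        return sum((x - a) * u for x, a, u in zip(point, c1, unit))

    return [q(p) for p in cluster1], [q(p) for p in cluster2]


def scaled_statistic(w, n1, n2):
    """W * sqrt(N1 + N2), the quantity the abstract refers to the
    noncentral t distribution. W itself must be supplied."""
    return w * math.sqrt(n1 + n2)


# Hypothetical example data: two small clusters in the plane.
cluster_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
cluster_b = [(4.0, 0.0), (5.0, 0.0), (4.0, 1.0), (5.0, 1.0)]

q_a, q_b = project_onto_centroid_axis(cluster_a, cluster_b)
print("cluster A: mean q =", mean(q_a), "sd =", stdev(q_a))
print("cluster B: mean q =", mean(q_b), "sd =", stdev(q_b))
```

With the origin at the first centroid, the mean of q for the first cluster is zero and the mean for the second equals the distance between the centroids, so the two distributions of q can be compared directly along a single axis.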

Author information

Authors and Affiliations

Department of Microbiology, University of Leicester, LE1 7RH, Leicester, UK
P. H. A. Sneath

About this article

Cite this article

Sneath, P.H.A. A method for testing the distinctness of clusters: A test of the disjunction of two clusters in Euclidean space as measured by their overlap. Mathematical Geology 9, 123–143 (1977). https://doi.org/10.1007/BF02312508

Received: 15 October 1976
Issue Date: April 1977
DOI: https://doi.org/10.1007/BF02312508
