
Variable selection in clustering

Journal of Classification

Abstract

Standard clustering algorithms can completely fail to identify clear cluster structure if that structure is confined to a subset of the variables. A forward selection procedure for identifying the subset is proposed and studied in the context of complete linkage hierarchical clustering. The basic approach can be applied to other clustering methods, too.
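The abstract gives no details of the selection rule, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a separation criterion chosen for illustration (the between-cluster share of the total sum of squares), a fixed number of clusters `k`, and a small made-up data set in which only variables 0 and 2 carry cluster structure. The function names `complete_linkage`, `separation`, and `forward_selection` are hypothetical.

```python
import math

def complete_linkage(points, k):
    """Naive agglomerative clustering; returns k clusters as lists of row indices."""
    n = len(points)
    d = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest pairwise distance.
                far = max(d[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or far < best[0]:
                    best = (far, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))
    return clusters

def separation(points, clusters):
    """ASSUMED criterion: between-cluster share of the total sum of squares."""
    n, p = len(points), len(points[0])
    grand = [sum(row[j] for row in points) / n for j in range(p)]
    total = sum((row[j] - grand[j]) ** 2 for row in points for j in range(p))
    between = 0.0
    for members in clusters:
        for j in range(p):
            mean = sum(points[i][j] for i in members) / len(members)
            between += len(members) * (mean - grand[j]) ** 2
    return between / total

def forward_selection(data, k):
    """Add variables one at a time, each time keeping the best-scoring candidate.

    Returns the whole path as (variable index, score) pairs in order of entry.
    """
    remaining = list(range(len(data[0])))
    chosen, path = [], []
    while remaining:
        best = None
        for v in remaining:
            cols = chosen + [v]
            sub = [[row[j] for j in cols] for row in data]
            score = separation(sub, complete_linkage(sub, k))
            if best is None or score > best[0]:
                best = (score, v)
        chosen.append(best[1])
        remaining.remove(best[1])
        path.append((best[1], best[0]))
    return path

# Made-up data: 12 observations, two groups of six (illustration only).
x0 = [0.0, 0.2, 0.4, 0.1, 0.3, 0.5, 5.0, 5.2, 5.4, 5.1, 5.3, 5.5]  # separates the groups
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5]  # unrelated to the groups
x2 = [1.0, 1.3, 1.1, 1.4, 1.2, 1.5, 4.0, 4.3, 4.1, 4.4, 4.2, 4.5]  # separates the groups
x3 = [5.0, 3.0, 1.0, 4.0, 2.0, 0.0, 4.5, 2.5, 0.5, 3.5, 1.5, 5.5]  # unrelated to the groups
data = [list(row) for row in zip(x0, x1, x2, x3)]

path = forward_selection(data, k=2)
for variable, score in path:
    print(f"added variable {variable}: separation = {score:.3f}")
```

On this toy data the two group-separating variables enter first and the score falls once an unrelated variable is added, so a drop in the path is one possible signal for where to stop. The sketch returns the full path rather than applying a stopping rule, because with this particular criterion the score cannot rise above that of the best single variable while the clusters stay the same. Any other clustering routine could be substituted for `complete_linkage`, in line with the abstract's remark that the approach extends to other methods.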




About this article

Cite this article

Fowlkes, E.B., Gnanadesikan, R. & Kettenring, J.R. Variable selection in clustering. Journal of Classification 5, 205–228 (1988). https://doi.org/10.1007/BF01897164


  • DOI: https://doi.org/10.1007/BF01897164
