Journal of Classification, Volume 5, Issue 1, pp 39–51

Clustering the rows and columns of a contingency table

  • Michael J. Greenacre

Abstract

A number of ways of investigating heterogeneity in a two-way contingency table are reviewed. In particular, we consider chi-square decompositions of the Pearson chi-square statistic with respect to the nodes of a hierarchical clustering of the rows and/or the columns of the table. A cut-off point which indicates “significant clustering” may be defined on the binary trees associated with the respective row and column cluster analyses. This approach provides a simple graphical procedure which is useful in interpreting a significant chi-square statistic of a contingency table.
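The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes a Ward-type agglomeration in which, at each step, the two rows whose merger reduces the Pearson chi-square statistic the least are pooled; the abstract does not spell out the algorithm, so this criterion, the function names `pearson_chi2` and `cluster_rows`, and the example counts are illustrative assumptions rather than the author's procedure. Because pooling all rows into one leaves a chi-square of zero, the reductions recorded at the nodes of the tree add up to the chi-square of the original table. The cut-off for "significant clustering" is not reproduced here, since its critical values are not given in the abstract.

```python
import numpy as np


def pearson_chi2(table):
    """Pearson chi-square statistic of a two-way table of counts.

    Assumes every row and column total is positive.
    """
    table = np.asarray(table, dtype=float)
    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    return float(((table - expected) ** 2 / expected).sum())


def cluster_rows(table):
    """Agglomerate rows, recording the chi-square lost at each merge.

    Returns a list of (members_a, members_b, drop) tuples, one per node
    of the binary tree, in the order the merges were made.
    Illustrative criterion (an assumption): merge the pair of rows whose
    pooling reduces the Pearson chi-square the least.
    """
    rows = [np.asarray(r, dtype=float) for r in table]
    labels = [(i,) for i in range(len(rows))]
    merges = []
    while len(rows) > 1:
        current = pearson_chi2(np.vstack(rows))
        best = None
        for i in range(len(rows)):
            for j in range(i + 1, len(rows)):
                # Collapse rows i and j into one and recompute the statistic.
                rest = [r for k, r in enumerate(rows) if k not in (i, j)]
                collapsed = np.vstack(rest + [rows[i] + rows[j]])
                drop = current - pearson_chi2(collapsed)
                if best is None or drop < best[0]:
                    best = (drop, i, j)
        drop, i, j = best
        merges.append((labels[i], labels[j], drop))
        pooled_row = rows[i] + rows[j]
        pooled_label = labels[i] + labels[j]
        rows = [r for k, r in enumerate(rows) if k not in (i, j)] + [pooled_row]
        labels = [l for k, l in enumerate(labels) if k not in (i, j)] + [pooled_label]
    return merges


# Hypothetical 4 x 3 table of counts, used only to exercise the code.
example = np.array([[20, 10, 5],
                    [18, 12, 6],
                    [5, 10, 20],
                    [4, 9, 25]])

merges = cluster_rows(example)
total = pearson_chi2(example)
for a, b, drop in merges:
    print(f"merge rows {a} and {b}: chi-square reduced by {drop:.3f}")
print(f"sum of node values = {sum(d for _, _, d in merges):.3f}, total = {total:.3f}")
```

The same function applied to the transposed table gives the corresponding tree for the columns.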

Keywords

Chi-square statistic; Cluster analysis; Contingency tables; Correspondence analysis; Multiple comparisons; Wishart distribution

Copyright information

© Springer-Verlag New York Inc 1988

Authors and Affiliations

  • Michael J. Greenacre, Department of Statistics, University of South Africa, Pretoria, South Africa
