Journal of Statistical Theory and Practice, Volume 5, Issue 4, pp 649–658

Measuring and Analyzing the Within Group Homogeneity of Multi-category Variables

  • David G. Steel
  • Mark D. Tranmer


Many variables exhibit within-group homogeneity, that is, similarity of values among the individual units that make up a group. Measures of within-group homogeneity are useful for the sample design and statistical analysis of datasets for populations that contain groups, such as individuals in geographical areas. Homogeneity measures are easily defined for continuous or dichotomous variables. Here, we propose a homogeneity measure for a multi-category variable and show how it can be calculated without access to individual-level data. We apply the measure to data from the UK census, show how it relates to the homogeneity of particular linear combinations of the categories, called Canonical Grouping Variables (CGVs), and explain how the CGVs are interpreted.
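For the dichotomous case mentioned above, the following minimal sketch illustrates how a homogeneity measure can be computed from group totals alone. It is not the multi-category measure proposed in the paper: it uses the standard one-way ANOVA estimator of the intra-class correlation, and the function name `binary_icc` and its arguments are hypothetical names chosen for illustration. It relies on the fact that, for a 0/1 variable, the within-group sum of squares in a group of size n with proportion p equals n·p·(1 − p), so no individual-level records are needed.

```python
def binary_icc(counts, sizes):
    """ANOVA-type intra-class correlation for a 0/1 variable.

    counts[g] is the number of units in group g with value 1 and
    sizes[g] is the total number of units in group g. Both are
    aggregate (group-level) quantities. Illustrative sketch only.
    """
    if len(counts) != len(sizes) or len(sizes) < 2:
        raise ValueError("need matching counts and sizes for at least two groups")
    n_total = sum(sizes)
    m = len(sizes)
    if n_total <= m:
        raise ValueError("need more units than groups")

    p = sum(counts) / n_total                        # overall proportion
    props = [c / n for c, n in zip(counts, sizes)]   # group proportions

    # Between- and within-group sums of squares from aggregates only.
    ss_between = sum(n * (pg - p) ** 2 for pg, n in zip(props, sizes))
    ss_within = sum(n * pg * (1 - pg) for pg, n in zip(props, sizes))

    msb = ss_between / (m - 1)
    msw = ss_within / (n_total - m)

    # Effective group size, allowing for unequal group sizes.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (m - 1)

    denom = msb + (n0 - 1) * msw
    if denom == 0:
        raise ValueError("variable is constant; homogeneity is undefined")
    return (msb - msw) / denom


if __name__ == "__main__":
    # Hypothetical area totals: units with the attribute, and area sizes.
    print(binary_icc([30, 12, 45], [100, 80, 120]))
```

Values near 1 indicate that units within the same group are very similar, values near 0 indicate that groups are no more alike internally than a random partition, and small negative values can arise when group proportions are nearly identical.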

AMS Subject Classification



Keywords: Groups, Clustering, Homogeneity, Intra-class correlation, Categorical variables, Canonical grouping variables, Aggregate data, Census area data





Copyright information

© Grace Scientific Publishing 2011

Authors and Affiliations

  • David G. Steel (1)
  • Mark D. Tranmer (2)

  1. Centre for Statistical and Survey Methodology, University of Wollongong, Australia
  2. Centre for Census and Survey Research, School of Social Sciences, University of Manchester, UK
