Environmental and Ecological Statistics, Volume 15, Issue 4, pp 385–401

A spatial clustering perspective on autocorrelation and regionalization

  • Ferenc Csillag
  • Sándor Kabos
  • Tarmo K. Remmel


We revisit one of the classical problems in geography and cartography, in which many observations on a lattice (N) must be grouped into far fewer regions (G), particularly when the desired number of regions is unknown a priori. Because an exhaustive search over all possible aggregations is not feasible, we propose a hierarchical classification scheme with an objective function that is sensitive to spatial pattern. The objective function, minimized while observations are assigned to regions (classification), consists of two terms: the first characterizes accuracy and the second characterizes model complexity. For the latter, we introduce a spatial measure based on the number of homogeneous patches rather than the usual number of classes. A simulation study shows that this classification procedure is less sensitive to random and spatially correlated error (noise) than non-spatial classification. We also show that, for conditional autoregressive (CAR) error (noise) fields, the optimal partitioning is the one with the highest within-units generalized Moran coefficient. The classifier is implemented in ArcView and is demonstrated on one socio-economic and one environmental application.
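The two-term objective described above can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the AIC-style form used here (N · log(RSS / N) + 2 · patches), the 4-neighbour definition of a patch, and the function names `count_patches`, `within_class_rss`, and `spatial_criterion` are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import deque


def count_patches(labels):
    """Count 4-connected patches of equal label on a rectangular lattice.

    Assumption: a 'homogeneous patch' is a maximal set of rook-adjacent
    cells that share the same class label.
    """
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            # Unvisited cell: start a new patch and flood-fill it.
            patches += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not seen[ni][nj]
                            and labels[ni][nj] == labels[i][j]):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
    return patches


def within_class_rss(values, labels):
    """Accuracy term: sum of squared deviations from each class mean."""
    sums, counts = {}, {}
    for vrow, lrow in zip(values, labels):
        for v, k in zip(vrow, lrow):
            sums[k] = sums.get(k, 0.0) + v
            counts[k] = counts.get(k, 0) + 1
    means = {k: sums[k] / counts[k] for k in sums}
    return sum((v - means[k]) ** 2
               for vrow, lrow in zip(values, labels)
               for v, k in zip(vrow, lrow))


def spatial_criterion(values, labels):
    """Hypothetical AIC-style score: accuracy term plus a patch penalty."""
    n = sum(len(row) for row in values)
    rss = max(within_class_rss(values, labels), 1e-12)  # guard against log(0)
    return n * math.log(rss / n) + 2 * count_patches(labels)


values = [[1.0, 3.0], [2.0, 4.0]]
compact = [[0, 1], [0, 1]]      # two classes, two patches
fragmented = [[0, 1], [1, 0]]   # two classes, four patches
print(spatial_criterion(values, compact), spatial_criterion(values, fragmented))
```

In this toy lattice both labelings use two classes, so a penalty based on the number of classes could not tell them apart on complexity; the patch count can, and the fragmented labeling receives the larger (worse) score under the assumed criterion.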


Ward clustering; Number of patches; Boundaries; Information criterion; CAR; AIC; Autocorrelation





Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Ferenc Csillag (1)
  • Sándor Kabos (2)
  • Tarmo K. Remmel (3)
  1. Department of Geography, University of Toronto at Mississauga, Mississauga, Canada
  2. Department of Statistics, Faculty of Social Sciences, ELTE Eötvös Loránd University, Budapest, Hungary
  3. Department of Geography, York University, Toronto, Canada