Volume 16, Issue 3, pp 497-521

Open Access: this content is freely available online to anyone, anywhere, at any time.

Processing aggregated data: the location of clusters in health data

  • Kevin Buchin, Department of Mathematics and Computer Science, TU Eindhoven
  • Maike Buchin, Department of Mathematics and Computer Science, TU Eindhoven
  • Marc van Kreveld, Department of Computer Science, Utrecht University
  • Maarten Löffler, Computer Science Department, University of California
  • Jun Luo, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
  • Rodrigo I. Silveira, Departament de Matemàtica Aplicada II, Universitat Politècnica de Catalunya


Spatially aggregated data is frequently used in geographical applications. Spatial analysis of aggregated data is often performed in the same way as on exact data, which ignores the fact that the actual locations of the data are unknown. Here we propose models and methods that take aggregation into account, focusing on the problem of locating clusters in spatially aggregated health data. The data is given as a subdivision into regions with two values per region: the number of cases and the size of the population at risk. We formulate the problem as finding a placement of a cluster window of a given shape that maximizes a cluster function depending on the population at risk and the cases. We propose area-based models to calculate the cases (and the population at risk) within a cluster window. These models are based on the areas of intersection of the cluster window with the regions of the subdivision. We show how to compute a subdivision such that, within each of its cells, the areas of intersection are simple functions. We evaluate experimentally how taking aggregation into account influences the location of the clusters found.


Keywords: Cluster, Aggregated data, Algorithm, Public health
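
The area-based idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method, and several things in it are assumptions made only for illustration: regions and the cluster window are axis-aligned rectangles, cases and population are spread uniformly within each region (so a window receives a share proportional to the area of intersection), and the cluster function is the plain ratio of cases to population at risk. The names `Rect`, `Region`, `window_totals`, `rate`, and `best_window` are hypothetical. The placement search is a brute-force scan over a list of candidate positions, which stands in for the subdivision-based computation mentioned in the abstract.

```python
from typing import Iterable, NamedTuple, Sequence, Tuple


class Rect(NamedTuple):
    """Axis-aligned rectangle: lower-left (x0, y0), upper-right (x1, y1)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)

    def intersection_area(self, other: "Rect") -> float:
        # Overlap extent along each axis; negative means no overlap.
        w = min(self.x1, other.x1) - max(self.x0, other.x0)
        h = min(self.y1, other.y1) - max(self.y0, other.y0)
        return max(0.0, w) * max(0.0, h)


class Region(NamedTuple):
    """One region of the subdivision with its two aggregated values."""
    shape: Rect          # assumed to have positive area
    cases: float
    population: float


def window_totals(window: Rect, regions: Sequence[Region]) -> Tuple[float, float]:
    """Estimate (cases, population) inside the window by area apportionment.

    Assumption: values are uniformly distributed within each region.
    """
    cases = 0.0
    population = 0.0
    for r in regions:
        fraction = r.shape.intersection_area(window) / r.shape.area()
        cases += fraction * r.cases
        population += fraction * r.population
    return cases, population


def rate(window: Rect, regions: Sequence[Region]) -> float:
    """Assumed cluster function: estimated cases per person at risk."""
    cases, population = window_totals(window, regions)
    return cases / population if population > 0 else 0.0


def best_window(regions: Sequence[Region], width: float, height: float,
                xs: Iterable[float], ys: Iterable[float]) -> Rect:
    """Brute-force scan: try each candidate lower-left corner, keep the best."""
    ys = list(ys)
    candidates = [Rect(x, y, x + width, y + height) for x in xs for y in ys]
    return max(candidates, key=lambda w: rate(w, regions))


# Two adjacent 2x2 regions; the window straddles their shared border.
regions = [
    Region(Rect(0, 0, 2, 2), cases=8, population=400),
    Region(Rect(2, 0, 4, 2), cases=2, population=400),
]
window = Rect(1, 0, 3, 2)
totals = window_totals(window, regions)   # half of each region: (5.0, 400.0)
best = best_window(regions, 2, 2, xs=[0, 1, 2], ys=[0])
```

A plain rate tends to favor windows that cover very little population, so a practical cluster function would likely need to account for that. For general polygonal regions, the rectangle overlap would have to be replaced by a polygon intersection routine.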