SSCP: Mining Statistically Significant Co-location Patterns
Co-location pattern discovery searches for subsets of spatial features whose instances are often located in close spatial proximity. Current algorithms, which rely on user-specified thresholds for prevalence measures, may report co-locations even when the features are randomly distributed. In our model, we look for subsets of spatial features that are co-located because of some form of spatial dependency rather than by chance. We first introduce a new definition of co-location patterns based on a statistical test. We then propose an algorithm for finding such co-location patterns, adopting two strategies to reduce the computational cost relative to a naïve approach based on simulations of the data distribution. First, we propose a pruning strategy for computing the prevalence measures. Second, we show that instead of generating all instances of an auto-correlated feature during a simulation, we can generate a reduced number of instances for the prevalence measure computation. We evaluate our algorithm empirically on synthetic and real data and compare our findings with the results of a state-of-the-art co-location mining algorithm.
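The naïve simulation-based test mentioned above can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper: it assumes the participation index as the prevalence measure, a unit-square study area, and a null model in which both features are independently and uniformly distributed (no auto-correlation). It applies neither the pruning strategy nor the reduced-instance generation. All function names and parameters are hypothetical.

```python
import math
import random


def participation_index(a_pts, b_pts, radius):
    """Minimum, over both features, of the fraction of instances that
    have at least one instance of the other feature within `radius`."""
    def ratio(src, dst):
        if not src:
            return 0.0
        hits = sum(
            1 for (x, y) in src
            if any(math.hypot(x - u, y - v) <= radius for (u, v) in dst)
        )
        return hits / len(src)

    return min(ratio(a_pts, b_pts), ratio(b_pts, a_pts))


def random_points(n, rng):
    """Assumed null model: n points placed uniformly in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def colocation_p_value(a_pts, b_pts, radius, n_sims=199, seed=0):
    """Monte Carlo p-value: share of simulated data sets whose prevalence
    is at least as large as the observed one (observed data counted once)."""
    rng = random.Random(seed)
    observed = participation_index(a_pts, b_pts, radius)
    at_least = 0
    for _ in range(n_sims):
        sim_a = random_points(len(a_pts), rng)
        sim_b = random_points(len(b_pts), rng)
        if participation_index(sim_a, sim_b, radius) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_sims + 1)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic, spatially dependent pair: every B sits right next to an A.
    feature_a = random_points(20, rng)
    feature_b = [(x + 0.01, y) for (x, y) in feature_a]
    pi, p = colocation_p_value(feature_a, feature_b, radius=0.02)
    print(f"participation index = {pi:.2f}, p-value = {p:.3f}")
```

Under these assumptions, a pattern would be reported only when its p-value falls below a chosen significance level, rather than when its prevalence exceeds a user-specified threshold. Each simulation recomputes the prevalence measure from scratch, which is the cost the two strategies in the paper aim to reduce.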