
Mining Co-Location Patterns with Rare Events from Spatial Data Sets

GeoInformatica

Abstract

A co-location pattern is a group of spatial features/events that are frequently located together in the same region. For example, human cases of West Nile Virus often occur in regions with poor mosquito control and the presence of birds. Previous studies of co-location pattern mining often emphasize the equal participation of every spatial feature. As a result, interesting patterns involving events with substantially different frequencies cannot be captured. In this paper, we address the problem of mining co-location patterns with rare spatial features. Specifically, we first propose a new measure called the maximal participation ratio (maxPR) and show that a co-location pattern with a relatively high maxPR value corresponds to a co-location pattern containing rare spatial events. Furthermore, we identify a weak monotonicity property of the maxPR measure. This property helps in developing an efficient algorithm to mine patterns with high maxPR values. As demonstrated by our experiments, our approach is effective in identifying co-location patterns with rare events, and is efficient and scalable for large-scale data sets.
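The abstract names maxPR without defining it. As a minimal sketch, assume the usual definition from the co-location literature: the participation ratio of a feature in a pattern is the fraction of that feature's instances that appear in at least one row instance of the pattern (a set of mutually neighboring instances, one per feature), and maxPR is the largest such ratio over the pattern's features. The function names, the Euclidean distance threshold `d`, and the toy data are illustrative assumptions, not the paper's algorithm; the brute-force enumeration below does not use the weak monotonicity property for pruning.

```python
from itertools import combinations, product
from math import dist


def row_instances(points, pattern, d):
    """Return index tuples (one instance per feature in `pattern`)
    whose members are pairwise within distance d of each other."""
    by_feature = [
        [i for i, (f, _) in enumerate(points) if f == feat] for feat in pattern
    ]
    return [
        combo
        for combo in product(*by_feature)
        if all(dist(points[a][1], points[b][1]) <= d
               for a, b in combinations(combo, 2))
    ]


def participation_ratios(points, pattern, d):
    """Fraction of each feature's instances that take part in some row instance."""
    rows = row_instances(points, pattern, d)
    ratios = {}
    for pos, feat in enumerate(pattern):
        total = sum(1 for f, _ in points if f == feat)
        participating = {row[pos] for row in rows}
        ratios[feat] = len(participating) / total if total else 0.0
    return ratios


def max_pr(points, pattern, d):
    """Assumed maxPR: the largest participation ratio in the pattern."""
    return max(participation_ratios(points, pattern, d).values())


# Hypothetical toy data: "A" is rare (2 instances), "B" is common (5 instances).
points = [
    ("A", (0, 0)), ("A", (10, 10)),
    ("B", (0, 1)), ("B", (10, 11)),
    ("B", (50, 50)), ("B", (60, 60)), ("B", (70, 70)),
]

ratios = participation_ratios(points, ("A", "B"), d=1.5)
print(ratios)                                # {'A': 1.0, 'B': 0.4}
print(min(ratios.values()))                  # 0.4
print(max_pr(points, ("A", "B"), d=1.5))     # 1.0
```

In this toy case every instance of the rare feature A has a B neighbor, but only 2 of 5 B instances have an A neighbor. A measure based on the smallest ratio would score the pattern at 0.4, while the largest ratio scores it at 1.0, which is why a high maxPR can surface patterns that involve a rare feature.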


References

  1. E. Brockovich, http://www.masryvititoe.com/erin_brockovich.shtml.

  2. “West Nile disease,” in http://www.cdc.gov/ncidod/dvbid/westnile/index.htm.

  3. R. Agrawal, T. Imielinski, and A. Swami. “Mining association rules between sets of items in large databases,” in Proc. of the ACM SIGMOD Conference on Management of Data, Washington, DC, pp. 207–216, 1993.

  4. R. Agrawal, and R. Srikant. “Fast algorithms for mining association rules,” in Proc. of the 20th Int’l Conference on Very Large Data Bases, Santiago, Chile, pp. 487–499, 1994.

  5. L. Arge, O. Procopiuc, S. Ramaswamy, T. Suel, and J. Vitter. “Scalable sweeping-based spatial join,” in Proc. of the Int’l Conference on Very Large Databases, Morgan Kaufmann, San Mateo, CA, pp. 570–581, 1998.

  6. Y. Chou. Exploring Spatial Analysis in Geographic Information System. Onward Press: Santa Fe, NM, ISBN: 1566901197, 1997.

  7. N.A.C. Cressie. Statistics for Spatial Data. Wiley: New York, ISBN: 0471843369, 1991.

  8. Environmental Systems Research Institute, Inc. “ArcGIS Family,” in http://www.esri.com.

  9. M. Ester, A. Frommelt, H.-P. Kriegel, and J. Sander. “Algorithms for characterization and trend detection in spatial databases,” in Proc. of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, pp. 44–50, 1998.

  10. V. Estivill-Castro, and I. Lee. “Data mining techniques for autonomous exploration of large volumes of geo-referenced crime data,” in Proc. of the 6th International Conference on Geocomputation, pp. 24–26, 2001.

  11. V. Estivill-Castro, and A. Murray. “Discovering associations in spatial data: An efficient medoid based approach,” in Proc. of the Second Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer, Berlin Heidelberg New York, pp. 110–121, 1998.

  12. R.H. Güting. “An introduction to spatial database systems,” Very Large Data Bases Journal, Vol. 3: 357–399, 1994.

  13. Y. Huang, H. Xiong, S. Shekhar, and J. Pei. “Mining confident co-location rules without a support threshold,” in Proceedings of the 18th Annual ACM Symposium on Applied Computing (SAC’03), Melbourne, Florida, pp. 497–501, 2003.

  14. J. M. Patel, and D. J. DeWitt. “Partition based spatial-merge join,” in Proc. of the ACM SIGMOD Conference on Management of Data, pp. 259–270, 1996.

  15. E. M. Knorr, R. T. Ng, and D. L. Shilvock. “Finding boundary shape matching relationships in spatial data,” in Proc. 5th International Symposium on Spatial Databases, Springer, Berlin Heidelberg New York, pp. 29–46, 1997.


  16. K. Koperski, J. Adhikary, and J. Han. “Spatial data mining: Progress and challenges,” in Workshop on Research Issues on Data Mining and Knowledge Discovery, pp. 409–418, Oxford University Press, UK, 1996.


  17. K. Koperski, and J. Han. “Discovery of spatial association rules in geographic information databases,” in Proc. of the 4th International Symposium on Spatial Databases, Springer, Berlin Heidelberg New York, pp. 47–66, 1995.


  18. M. Koubarakis, T.K. Sellis, A.U. Frank, S. Grumbach, R.H. Güting, C.S. Jensen, N.A. Lorentzos, Y. Manolopoulos, E. Nardelli, B. Pernici, H.-J. Schek, M. Scholl, B. Theodoulidis, and N. Tryfona. Spatio-Temporal Databases: The CHOROCHRONOS Approach. Springer: Berlin Heidelberg New York, 2003.

  19. S.T. Leutenegger, and M.A. Lopez. “The effect of buffering on the performance of R-trees,” in Proc. of the Int’l Conference on Data Engineering, IEEE Educational Activities Department, pp. 164–171, 1998.

  20. Y. Morimoto. “Mining frequent neighboring class sets in spatial databases,” in Proc. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, pp. 353–358, 2001.

  21. R. Munro, S. Chawla, and P. Sun. “Complex spatial relationships,” in The Third IEEE International Conference on Data Mining (ICDM2003), IEEE Computer Society, p. 227, 2003.

  22. R. T. Ng, and J. Han. “Efficient and effective clustering methods for spatial data mining,” in 20th International Conference on Very Large Data Bases, Morgan Kaufmann, San Mateo, CA, pp. 144–155, 1994.

  23. J.F. Roddick, and M. Spiliopoulou. “A bibliography of temporal, spatial and spatio-temporal data mining research,” in ACM Special Interest Group on Knowledge Discovery in Data Mining Explorations, New York, pp. 34–38, 1999.

  24. S. Shekhar, and S. Chawla. Spatial Databases: A Tour. Prentice Hall: New Jersey, ISBN: 0130174807, 2003.

  25. S. Shekhar, S. Chawla, S. Ravada, A. Fetterer, X. Liu, and C.T. Lu. “Spatial databases: Accomplishments and research needs,” IEEE Trans. Knowl. Data Eng., Vol. 11(1): 45–55, 1999.

  26. S. Shekhar, and Y. Huang. “Co-location rules mining: A summary of results,” in Proc. 7th Intl. Symposium on Spatio-temporal Databases, Springer, Berlin Heidelberg New York, p. 236, 2001.

  27. S. Shekhar, C.T. Lu, and P. Zhang. “Detecting graph-based spatial outliers: Algorithms and applications,” in The Seventh ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, San Francisco, California, pp. 371–376, 2001.

  28. S. Shekhar, P. Schrater, W.R. Raju, W. Wu, and S. Chawla. “Spatial contextual classification and prediction models for mining geospatial data,” IEEE Transactions on Multimedia: Special Issue on Multimedia Databases, pp. 174–188, 2002.

  29. W. Wang, J. Yang, and R. Muntz. “STING: A statistical information grid approach to spatial data mining,” in International Conference on Very Large Data Bases, Athens, Greece, Morgan Kaufmann, San Mateo, CA, pp. 186–195, 1997.

  30. M.F. Worboys. GIS: A Computing Perspective. Taylor and Francis: New York, 1995.

  31. X. Zhang, N. Mamoulis, D.W.L. Cheung, and Y. Shou. “Fast mining of spatial collocations,” in Proc. of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, pp. 384–393, 2004.


Author information

Corresponding author

Correspondence to Yan Huang.

Additional information

A preliminary version of the paper appeared as [13].

The research of the second author is supported in part by the Natural Sciences and Engineering Research Council of Canada under grant number 312194-05 and the National Science Foundation of the United States under grant number IIS-0308001. All opinions, findings, conclusions and recommendations in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.

About this article

Cite this article

Huang, Y., Pei, J. & Xiong, H. Mining Co-Location Patterns with Rare Events from Spatial Data Sets. Geoinformatica 10, 239–260 (2006). https://doi.org/10.1007/s10707-006-9827-8

  • DOI: https://doi.org/10.1007/s10707-006-9827-8
