Spatial Itemset Mining: A Framework to Explore Itemsets in Geographic Space

  • Christian Sengstock
  • Michael Gertz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8133)

Abstract

Driven by the widespread adoption of mobile devices, user-contributed geographic information has become ubiquitous. A typical example is georeferenced and tagged social media, which links a location to a set of features or attributes. Mining frequent sets of discrete attributes to discover interesting patterns and rules of attribute usage in such data sets is an important data mining task.

In this work, we extend the frequent itemset mining framework to model the spatial distribution of itemsets and association rules. To this end, we assume that the input transactions have an associated spatial attribute, as is present, for example, in georeferenced tag sets. Using the framework, we formulate interestingness measures that are based on the underlying spatial distribution of the input transactions, namely area, spatial support, location-conditional support, and spatial confidence. We show that the spatial characteristics of itemsets cannot be described by existing approaches for mining association rules with numeric attributes, and that the problem differs from co-location pattern mining and spatial association rule mining. We demonstrate the usefulness of the proposed extension through several mining tasks on a real-world data set from Flickr.
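
As a rough illustration of the starting point, the following Python sketch computes classical support and confidence over georeferenced tag sets. The abstract does not spell out how area, spatial support, location-conditional support, or spatial confidence are defined, so the per-grid-cell support below is only an assumed stand-in for a location-aware measure and not the paper's definition. The toy data, the grid discretization, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (longitude, latitude, tag set). Not taken from the paper.
TRANSACTIONS = [
    (8.67, 49.41, {"castle", "heidelberg"}),
    (8.68, 49.41, {"castle", "river"}),
    (8.69, 49.40, {"castle", "heidelberg", "river"}),
    (13.40, 52.52, {"berlin", "river"}),
    (13.41, 52.52, {"berlin"}),
]


def support(itemset, transactions):
    """Classical support: fraction of transactions that contain the itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for _, _, tags in transactions if itemset <= tags)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Classical confidence of the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / base


def cell_of(lon, lat, cell_size=1.0):
    """Assumed discretization: map a coordinate to a square grid cell."""
    return (int(lon // cell_size), int(lat // cell_size))


def support_per_cell(itemset, transactions, cell_size=1.0):
    """Illustrative stand-in only: support computed separately in each grid cell."""
    itemset = frozenset(itemset)
    totals = defaultdict(int)
    hits = defaultdict(int)
    for lon, lat, tags in transactions:
        cell = cell_of(lon, lat, cell_size)
        totals[cell] += 1
        if itemset <= tags:
            hits[cell] += 1
    return {cell: hits[cell] / totals[cell] for cell in totals}
```

In this toy data, the tag "river" has a single global support value, while `support_per_cell({"river"}, TRANSACTIONS)` reports a different value for each grid cell. That contrast between one global number and a location-dependent one is the kind of spatial variation the abstract refers to.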

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Christian Sengstock (1)
  • Michael Gertz (1)

  1. Database Systems Research Group, Heidelberg University, Germany
