Abstract
Driven by the widespread adoption of mobile devices, user-contributed geographic information has become ubiquitous. A typical example is georeferenced, tagged social media, which links a location to a set of features or attributes. Mining frequent sets of discrete attributes to discover interesting patterns and rules of attribute usage in such data sets is an important data mining task.
In this work, we extend the frequent itemset mining framework to model the spatial distribution of itemsets and association rules. To this end, we assume that each input transaction has an associated spatial attribute, as is the case for georeferenced tag sets. Using the framework, we formulate interestingness measures based on the underlying spatial distribution of the input transactions, namely area, spatial support, location-conditional support, and spatial confidence. We show that existing approaches to mining association rules with numeric attributes cannot describe the spatial characteristics of itemsets, and that the problem differs from co-location pattern mining and spatial association rule mining. We demonstrate the usefulness of the proposed extension through several mining tasks on a real-world data set from Flickr.
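To make the setting concrete, the following minimal Python sketch computes the classical (non-spatial) support and confidence over a handful of georeferenced tag sets, and then restricts support to cells of a coarse grid. The sample data, the function names, and the grid-based `support_by_cell` are assumptions made for illustration only; in particular, the grid restriction is a simple stand-in and is not the definition of spatial support, location-conditional support, or spatial confidence used in the paper.

```python
from collections import defaultdict

# Hypothetical georeferenced tag sets: (longitude, latitude, tags).
transactions = [
    (8.67, 49.41, {"castle", "river"}),
    (8.68, 49.41, {"castle", "bridge"}),
    (8.69, 49.40, {"castle", "river", "bridge"}),
    (13.40, 52.52, {"river", "tower"}),
    (13.41, 52.52, {"tower"}),
]


def support(itemset, txns):
    """Classical support: fraction of transactions containing every item."""
    if not txns:
        return 0.0
    hits = sum(1 for _, _, tags in txns if itemset <= tags)
    return hits / len(txns)


def confidence(antecedent, consequent, txns):
    """Classical confidence of the rule antecedent -> consequent."""
    denom = support(antecedent, txns)
    if denom == 0:
        return 0.0
    return support(antecedent | consequent, txns) / denom


def support_by_cell(itemset, txns, cell_size=1.0):
    """Illustrative assumption: classical support computed separately for
    the transactions falling into each cell of a regular lon/lat grid."""
    cells = defaultdict(list)
    for lon, lat, tags in txns:
        key = (int(lon // cell_size), int(lat // cell_size))
        cells[key].append((lon, lat, tags))
    return {key: support(itemset, members) for key, members in cells.items()}


global_support = support({"castle"}, transactions)
rule_confidence = confidence({"castle"}, {"river"}, transactions)
local_support = support_by_cell({"castle"}, transactions)
```

On this toy data, the tag `castle` has a global support of 0.6, but it occurs in every transaction of one grid cell and in none of the other. A single global number hides this difference, which is the kind of spatial variation the measures named above are meant to capture.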
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sengstock, C., Gertz, M. (2013). Spatial Itemset Mining: A Framework to Explore Itemsets in Geographic Space. In: Catania, B., Guerrini, G., Pokorný, J. (eds) Advances in Databases and Information Systems. ADBIS 2013. Lecture Notes in Computer Science, vol 8133. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40683-6_12
DOI: https://doi.org/10.1007/978-3-642-40683-6_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40682-9
Online ISBN: 978-3-642-40683-6
eBook Packages: Computer Science (R0)