Interestingness Hotspot Discovery in Spatial Datasets Using a Graph-Based Approach

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9729)

Abstract

This paper proposes a novel methodology for discovering interestingness hotspots in spatial datasets using a graph-based algorithm. We define interestingness hotspots as contiguous regions in space that are interesting according to a domain expert’s notion of interestingness, as captured by an interestingness function. In our recent work, we proposed a computational framework that discovers interestingness hotspots in gridded datasets using a three-step approach consisting of seeding, hotspot growing, and post-processing. In this work, we extend the framework to discover hotspots in any given spatial dataset. The proposed methodology first creates a neighborhood graph for the dataset and then identifies seed regions in the graph using the interestingness measure. Next, it grows interestingness hotspots from the seed regions by adding neighboring nodes so as to maximize the given interestingness function. Finally, after all interestingness hotspots have been identified, we create a polygon model for each hotspot using an approach based on Voronoi tessellations and the convex hull of the objects belonging to the hotspot. The methodology is evaluated in a case study on a 2-dimensional earthquake dataset, in which we find interestingness hotspots based on variance and correlation interestingness functions.
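
The pipeline outlined in the abstract (neighborhood graph, seeding, hotspot growing, polygon model) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the neighborhood graph is a k-nearest-neighbor graph, the interestingness function is the population variance of one attribute, a seed is a node together with its direct neighbors, growing stops when the best candidate would push the score below a threshold, and the polygon model is reduced to a plain convex hull (the Voronoi-based refinement and any post-processing are omitted). All function names and parameters are hypothetical.

```python
from math import dist
from statistics import pvariance


def knn_graph(points, k=2):
    """Assumed neighborhood graph: undirected k-nearest-neighbor graph."""
    graph = {i: set() for i in range(len(points))}
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        for j in others[:k]:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def variance_interestingness(region, values):
    """Assumed interestingness: population variance of the attribute."""
    return pvariance([values[i] for i in region]) if len(region) > 1 else 0.0


def find_seeds(graph, values, score, threshold):
    """Seed regions: a node plus its neighbors, kept if the score is high enough."""
    seeds = []
    for node, nbrs in graph.items():
        region = {node} | nbrs
        if score(region, values) >= threshold:
            seeds.append(region)
    return seeds


def grow_hotspot(seed, graph, values, score, threshold):
    """Greedily add the adjacent node that gives the highest score.

    Growth stops when no adjacent node is left or when the best
    candidate would drop the score below the threshold.
    """
    region = set(seed)
    while True:
        frontier = set().union(*(graph[n] for n in region)) - region
        if not frontier:
            return region
        best = max(sorted(frontier), key=lambda n: score(region | {n}, values))
        if score(region | {best}, values) < threshold:
            return region
        region.add(best)


def convex_hull(pts):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))


if __name__ == "__main__":
    # Toy data (made up): one cluster with varying values, one with constant values.
    points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11), (11, 11)]
    values = [1, 9, 2, 8, 5, 5, 5, 5]
    graph = knn_graph(points, k=2)
    for seed in find_seeds(graph, values, variance_interestingness, threshold=5.0):
        hotspot = grow_hotspot(seed, graph, values, variance_interestingness, 5.0)
        print(sorted(hotspot), convex_hull([points[i] for i in hotspot]))
```

Passing the interestingness function as an argument mirrors the idea that the domain expert supplies it; a correlation-based function over two attributes could be substituted without changing the seeding or growing code.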

Author information

Corresponding author

Correspondence to Fatih Akdag.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Akdag, F., Eick, C.F. (2016). Interestingness Hotspot Discovery in Spatial Datasets Using a Graph-Based Approach. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_42

  • DOI: https://doi.org/10.1007/978-3-319-41920-6_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41919-0

  • Online ISBN: 978-3-319-41920-6

  • eBook Packages: Computer Science, Computer Science (R0)
