
Rough-graph-based hotspot detection of polygon vector data

Multimedia Tools and Applications

Abstract

Spatial polygon data represent the areas over which events such as disease cases, crime, health care facilities, earthquakes, and fires occur. Finding hotspots is crucial in exploratory data analysis, yet finding spatially significant clusters remains challenging. In this paper, we therefore propose a novel rough-graph-based method that finds statistically significant hotspots. First, the Global Moran's I index is calculated to check for the presence of a hotspot in the data set; a positive value indicates that a hotspot is present. Then the RGBHSD algorithm constructs a rough graph in which each polygon is a node and an edge joins two nodes if the corresponding polygons are neighbours. Boundary value analysis is then performed on the lower region of the rough graph, which reassigns some additional boundary polygons to the lower region. The polygons belonging to the lower region are considered candidate hotspots. After the candidate hotspots are detected, a statistical significance test identifies the significant hotspots. Finally, the RGBHSD algorithm is evaluated using evaluation metrics. We tested the algorithm on a socio-economic dataset of UP, India, and a Brexit dataset of the UK. In the socio-economic dataset, the health facilities provided in the villages are used to find the hotspots. In the Brexit dataset, the field recording the percentage of votes for or against the UK staying in the European Union is used. The analysis shows that the generated hotspots are denser, the algorithm takes less time, and the HPAI value is higher than for other methods in the literature. The results show that the hotspots are scattered over the study region but clustered in some areas, such as west UP and east UP. In these areas the hotspots mark where health facilities are offered; for the Brexit data, the hotspots are clustered in the southern region.
This type of analysis is suitable for dealing with a pandemic and for understanding the pattern of disasters such as droughts and floods.
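
As a rough illustration of the first two steps named in the abstract, the Python sketch below builds a neighbour graph from polygons and computes Global Moran's I over it. It is a minimal sketch under stated assumptions, not the RGBHSD implementation: the function names `shared_edge_neighbours` and `global_morans_i`, the rule that two polygons are neighbours when they share an edge, the binary (0/1) weights, and the toy data are all assumptions introduced here. The paper may define neighbourhood and weights differently.

```python
from itertools import combinations


def shared_edge_neighbours(polygons):
    """Build an adjacency map: polygon index -> set of neighbouring indices.

    Assumption: a polygon is a list of (x, y) vertices, and two polygons
    are neighbours if they share at least one identical edge.
    """
    def edges(ring):
        # Undirected edges between consecutive vertices, closing the ring.
        return {frozenset((ring[i], ring[(i + 1) % len(ring)]))
                for i in range(len(ring))}

    edge_sets = [edges(p) for p in polygons]
    adj = {i: set() for i in range(len(polygons))}
    for i, j in combinations(range(len(polygons)), 2):
        if edge_sets[i] & edge_sets[j]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def global_morans_i(values, adj):
    """Global Moran's I with binary weights w_ij = 1 for neighbours.

    I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(len(nbrs) for nbrs in adj.values())
    den = sum(d * d for d in dev)
    if s0 == 0 or den == 0:
        raise ValueError("Moran's I is undefined without neighbours or variance")
    num = sum(dev[i] * dev[j] for i in adj for j in adj[i])
    return (n / s0) * (num / den)


# Toy data (hypothetical): four unit squares in a row.
polygons = [[(k, 0), (k + 1, 0), (k + 1, 1), (k, 1)] for k in range(4)]
adj = shared_edge_neighbours(polygons)

clustered = global_morans_i([10, 9, 1, 2], adj)      # similar values adjacent
alternating = global_morans_i([10, 1, 10, 1], adj)   # high and low interleaved
print(clustered > 0, alternating < 0)  # True True
```

On the toy row of squares, the attribute values with high values side by side give a positive index, which is the condition the abstract uses to signal that a hotspot is present, while the interleaved values give a negative index. Real polygon layers usually need tolerance-aware topology handling rather than exact vertex matching, so a GIS library would normally replace the neighbour test here.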

Data availability

The data set can be downloaded from NASA Socioeconomic Data and Applications Center (SEDAC).

Code availability

QGIS was used for map plotting, and Python 3 was used to implement the algorithms.

Author information

Contributions

Sonajharia Minz provided overall direction for drafting and preparing the manuscript and for revising it through successive versions. Mohd Shamsh Tabarej wrote the text of the paper. He also served as corresponding author and handled the paper's revision and resubmission.

Corresponding author

Correspondence to Mohd Shamsh Tabarej.

Ethics declarations

Ethics approval

The material in the study has been ethically approved by all of the paper's authors.

Consent for publication

All the authors of the paper give their consent to publish the material presented in the paper.

Competing interests

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tabarej, M.S., Minz, S. Rough-graph-based hotspot detection of polygon vector data. Multimed Tools Appl 83, 16683–16710 (2024). https://doi.org/10.1007/s11042-023-16246-4

  • DOI: https://doi.org/10.1007/s11042-023-16246-4
