Abstract
Spatial polygon data represent the areas affected by events such as disease cases, crime, health care facilities, earthquakes, and fires. Finding hotspots is a crucial step in exploratory data analysis, yet finding spatially significant clusters remains challenging. In this paper, we propose a novel method based on the rough graph that finds statistically significant hotspots. First, the Global Moran's I index is calculated to check for the presence of hotspots in the data set; a positive value indicates that hotspots are present. Then the RGBHSD algorithm is applied. It constructs a rough graph in which each polygon is a node and an edge joins two nodes if the corresponding polygons are neighbours. Boundary value analysis is then performed on the lower region of the rough graph, which moves some additional boundary polygons into the lower region. The polygons belonging to the lower region are considered candidate hotspots. After the candidate hotspots are detected, a statistical significance test is applied to find the significant hotspots. Finally, the RGBHSD algorithm is evaluated using evaluation metrics. We tested the algorithm on a socio-economic dataset of UP, India, and a Brexit dataset of the UK. In the socio-economic dataset, the health facilities provided in the villages are used to find hotspots. In the Brexit dataset, the field giving the percentage of votes on whether the UK should remain in the European Union is used. The analysis shows that the hotspots generated are denser, the running time is lower, and the HPAI value is higher than for other methods in the literature. The results show that the hotspots are scattered over the study region but clustered in some areas, such as west UP and east UP. For the socio-economic data, the hotspots mark the areas where health facilities are available; for the Brexit data, the hotspots are clustered in the south.
This type of analysis is suitable for responding to a pandemic and for understanding the pattern of disasters such as droughts and floods.
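To make the first step concrete, the following is a minimal sketch of the standard Global Moran's I statistic computed on a polygon neighbour graph. It is not the authors' implementation: the function names `build_neighbour_graph` and `global_morans_i`, the binary (0/1) contiguity weights, and the toy values are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Sequence, Set, Tuple


def build_neighbour_graph(n: int, pairs: Iterable[Tuple[int, int]]) -> Dict[int, Set[int]]:
    """Undirected graph: one node per polygon, one edge per neighbouring pair.

    Hypothetical helper; in practice the pairs would come from polygon contiguity.
    """
    graph: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def global_morans_i(values: Sequence[float], graph: Dict[int, Set[int]]) -> float:
    """Global Moran's I with binary weights (w_ij = 1 if i and j are neighbours).

    I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    # S0 is the sum of all weights; each undirected edge is counted twice.
    s0 = sum(len(nbrs) for nbrs in graph.values())
    if denom == 0 or s0 == 0:
        raise ValueError("Moran's I is undefined for constant values or an empty graph")
    num = sum(dev[i] * dev[j] for i, nbrs in graph.items() for j in nbrs)
    return (n / s0) * num / denom


# Toy example (assumed data): four polygons in a chain 0-1-2-3.
chain = build_neighbour_graph(4, [(0, 1), (1, 2), (2, 3)])
clustered = global_morans_i([10, 9, 1, 0], chain)     # similar values are adjacent
alternating = global_morans_i([10, 0, 10, 0], chain)  # high and low values alternate
print(clustered, alternating)
```

In this toy case the clustered values give a positive index and the alternating values give a negative one, which matches the reading in the abstract that a positive Global Moran's I suggests hotspots are present.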
Data availability
The data set can be downloaded from the NASA Socioeconomic Data and Applications Center (SEDAC).
Code availability
QGIS was used for map plotting, and Python 3 was used to implement the algorithms.
Author information
Contributions
Sonajharia Minz provided overall direction for the early drafting and preparation of the manuscript and guided its improvement through successive revisions. Mohd Shamsh Tabarej wrote the paper. He also served as corresponding author and handled the paper's revision and resubmission.
Ethics declarations
Ethics approval
The material in the study has been ethically approved by all of the paper's authors.
Consent for publication
All the authors of the paper give their consent to publish the material presented in the paper.
Competing Interests
All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
About this article
Cite this article
Tabarej, M.S., Minz, S. Rough-graph-based hotspot detection of polygon vector data. Multimed Tools Appl 83, 16683–16710 (2024). https://doi.org/10.1007/s11042-023-16246-4
DOI: https://doi.org/10.1007/s11042-023-16246-4