1 Introduction

The world is grappling with the emergence of the novel coronavirus (nCoV) disease, subsequently named CoVID-19, which is caused by the virus strain SARS-CoV-2 and has been spreading since it was first reported in China in early December 2019. While the world searches for a remedy through vaccine research, the transmission, direction, and rate at which the disease spreads have become a critical information gap [1]. The pattern and trajectory of the CoVID-19 outbreak are still unknown, and the specific variables influencing the disease's geographical distribution, as well as the initiation of transmission from nation to nation, are yet to be fully explored.

CoVID-19 has spread fast around the planet, posing significant health, economic, environmental, and social risks to the whole human population [2]. Major outbreaks pose an immediate threat to economic and social growth everywhere. In geography and spatial epidemiology, key research areas include summarizing and assessing the spatio-temporal distribution patterns and regularities of infectious illnesses [3], dissemination patterns [4], spatial consequences [5], geographic influence [6], and public health relevance. Several studies have highlighted the relevance of GIS in health geography and epidemiology and the usefulness of its implementation in addressing spatial challenges. GIS-based strategies in health geography have previously included digital real-time or near-real-time mapping of disease cases and their distribution, forecasting of risk or vulnerability mapping, coding and display of social media interactions and population travel data, and tracking of interaction trajectories across space and time [7, 8].

Since the onset of the CoVID-19 epidemic, a number of researchers have used GIS technology to examine the pandemic's spread. The CoVID-19 pandemic is rife with unknowns, and many of them have a spatial component that makes the phenomenon geographical and possibly mappable [9]. Alkhamis and colleagues (2020) investigated the spatio-temporal dynamics of the CoVID-19 pandemic in the State of Kuwait. Using daily confirmed cases, they carried out an exploratory study of the temporal and spatiotemporal dynamics of the pandemic in Kuwait [10]. An in-depth examination of CoVID-19 in Iran included risk mapping, change detection, geographic modelling of disease dissemination, and trend analysis of outbreaks. The researchers compared Iranian coronavirus data with global trends, used regression modelling to forecast death trends, applied spatial modelling, risk mapping, and change detection using the random forest (RF) machine learning technique (MLT), and verified the risk map they had predicted [11]. A different study sought to identify the CoVID hotspot regions in India using data on total population density, foreign visitors, and confirmed CoVID cases. Using geospatial technology, another study derived the spatiotemporal pattern of CoVID clustering for India [12]. A few others attempted to characterise the spatial pattern, clustering, spatial direction, and heterogeneity of CoVID-19 hotspots [13, 14].

Spatial cluster analysis depends on the geography of the events and requires accurate and meaningful handling of space and spatial linkages together with the observed location and event data. It has so far required specific structural and accounting procedures for distance, outliers, contiguity, geographical irregularity, and so on [14]. However, most of these efforts have been made at a national, state, or district level. Very few studies specifically focus on developing nations in Asia and show CoVID-19 clustering at a local or neighborhood scale [15]. Assessing the spatial pattern of CoVID-19 hotspots, which comprises clustering, spatial direction, and heterogeneity, may be essential from the standpoint of micro-planning. We examine the CoVID-19 spread at the block level through a case study of North 24 Parganas, the most vulnerable district in West Bengal, India. The specific goal of this study is to apply the spatial autocorrelation technique to assess the CoVID spatial pattern and find statistically significant clusters of disease hotspots at the neighborhood scale. Based on the containment zones, we evaluate the spatiotemporal distribution of CoVID-19 and map its hotspots. Containment zones are areas with a high concentration of CoVID-19-positive cases demarcated during the nationwide unlock (see Footnote 1). In order to map and stop local transmission, these zones were delineated by the Rapid Response Team, Union Ministry for Health and Family Welfare, Government of India.

2 Method

North 24 Parganas, the most populous district in West Bengal, is bordered to the north by Nadia district, to the south by South 24 Parganas, and to the east by Bangladesh (Fig. 1). The district lies in the new alluvium sub-region (Zone-III) of the lower Gangetic Plain, which is considered the most productive for agriculture, and its topography is mostly flat. Numerous rivers traverse the district, including the Ichhamati, Kalindi, Raimangal, Dansa, Borokalagachi, Benti, Haribhanga, Gourchrar, Bidyadhari, and Hooghly. Like the rest of Gangetic West Bengal, it has a tropical climate. The monsoon lasts from early June to mid-September and sporadically into October. Winter, from late November to mid-February, is dry, while summer is very humid. According to the 2011 Indian census, North 24 Parganas has a population of 10,009,781 people (5,119,389 men and 4,890,392 women) and covers 4094 square kilometers, giving a population density of 2445 persons per square kilometer. However, the population density varies from 655 to 3899 persons per square kilometer across rural and urban blocks (Fig. 2). The average literacy rate was 84.06% in 2011, up from 78.07% in 2001. The sex ratio was 955 women for every 1000 men as per the 2011 Census.

Fig. 1

Location map of study area (a). India—1. Andaman and Nicobar; 2. Andhra Pradesh; 3. Arunachal Pradesh; 4. Assam; 5. Bihar; 6. Chandigarh; 7. Chhattisgarh. 8. Dadra and Nagar Haveli; 9. Daman and Diu; 10. Goa; 11. Gujarat; 12. Haryana; 13. Himachal Pradesh; 14. Jammu and Kashmir; 15. Jharkhand; 16. Karnataka; 17. Kerala; 18. Lakshadweep; 19. Madhya Pradesh; 20. Maharashtra; 21. Manipur; 22. Meghalaya; 23. Mizoram; 24. Nagaland; 25. NCT of Delhi; 26. Odisha; 27. Puducherry; 28. Punjab; 29. Rajasthan; 30. Sikkim; 31. Tamil Nadu; 32. Telangana; 33. Tripura; 34. Uttar Pradesh; 35. Uttarakhand; 36. West Bengal. (b). West Bengal—1. Alipurduar; 2. Bankura; 3. Barddhaman; 4. Birbhum; 5. Dakshin Dinajpur; 6. Darjiling; 7. Haora; 8. Hugli; 9. Jalpaiguri; 10. Koch Bihar; 11. Kolkata; 12. Maldah; 13. Murshidabad; 14. Nadia; 15. North 24 Parganas; 16. Pashchim Medinipur; 17. Purba Medinipur; 18. Puruliya; 19. South 24 Parganas; 20. Uttar Dinajpur. (c). North 24 Parganas—1. Bagda; 2. Bongaon; 3. Gaighata; 4. Habra—I; 5. Swarupnagar; 6. Baduria; 7. Bashirhat- I; 8. Bashirhat- II; 9. Deganga; 10. Hasnabad; 11. Haroa; 12. Minakhan; 13. Sandeshkhali—I; 14. Sandeshkhali—II; 15. Hingalganj; 16. Habra- II; 17. Barasat- I; 18. Barasat- II; 19. Rajarhat; 20. Barrackpore—II; 21. Amdanga; 22. Barrackpore—I

Fig. 2

Study area at a glance (a). Rural and Urban Settlement and (b). Population Density, Census 2011—1. Bagda; 2. Bongaon; 3. Gaighata; 4. Habra—I; 5. Swarupnagar; 6. Baduria; 7. Bashirhat- I; 8. Bashirhat- II; 9. Deganga; 10. Hasnabad; 11. Haroa; 12. Minakhan; 13. Sandeshkhali—I; 14. Sandeshkhali—II; 15. Hingalganj; 16. Habra- II; 17. Barasat- I; 18. Barasat- II; 19. Rajarhat; 20. Barrackpore—II; 21. Amdanga; 22. Barrackpore—I

Since the outbreak of CoVID-19, the district has been severely affected by the coronavirus. More than 1000 new CoVID cases were reported each day in North 24 Parganas during the unlock study period (May 2020–October 2020). Most researchers concur that the actual number of affected individuals in the state is likely to be at least five times higher than the reported number because many patients have not been tested. The West Bengal Health Bulletin reports that CoVID-19 incidence in North 24 Parganas peaked between March 17 and May 14, 2020. The number of CoVID-19 cases in North 24 Parganas reportedly almost reached its peak during this period, and from the end of October 2020 the cases started to decline. This suggests that either the curfew or the total lockdown helped to slow the rise. It was anticipated that CoVID-19 cases would decrease further. Against this background, the present study uses the available data on containment zone demarcation from May 2020 to October 2020 to illustrate the geographic distribution and cluster patterns of CoVID-19 at the micro-level in North 24 Parganas, West Bengal, India. This information should be helpful for effective control of future epidemic outbreaks.

The study was conducted in three steps. First, we collected data on all confirmed CoVID-19 cases and containment zones in the North 24 Parganas district of West Bengal, India. Next, we created a spatial database using a Geographic Information Systems (GIS) approach. Finally, we used the ArcGIS 10.0 geostatistical analysis module to investigate the geographic distribution of these cases.

2.1 Data acquisition

To assess the spatio-temporal spread of CoVID-19 based on containment points, the containment zone data for May to October 2020 were first collected from the official website [16]. For each month, the data were collected for the date on which the maximum number of containment zones was demarcated in the study area. Referring to Fig. 3, on 13th May 2020, 118 containment zones were notified in the study area. On 30th June 2020, 715 containment zones were demarcated across the municipalities and urban centres of North 24 Parganas. On 5th July 2020, 921 locations were notified as containment zones. In August, September, and October, 94, 31, and 10 containment zones were recorded, respectively.

Fig. 3

Covid containment points in North 24 Parganas between (a). May 2020; (b). June 2020; (c). July 2020; (d). August 2020; (e). September 2020; to (f). October 2020

2.2 Database creation, centroid identification and distribution mapping

The boundaries of the containment zones are extremely difficult to capture because the full address, zip code, or ward is not available for every data point, so we used point data based on zip code for further analysis. A point layer was created through geocoding, using the zip code of each address within each block together with its ward number. For each month, the directional pattern, compactness, and orientation of the dispersion of CoVID cases were determined using the directional distribution tool, specifically the standard deviational ellipse (SDE). The ellipse shows whether the distribution of features is elongated and therefore has a specific orientation [14].
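As a rough illustration of what the SDE summarises, the following Python sketch derives an ellipse from the covariance of a set of point coordinates. It is a minimal sketch under stated assumptions: the function name `standard_deviational_ellipse` and the sample coordinates are hypothetical, the angle is measured counter-clockwise from the east (x) axis, and the axis lengths are one standard deviation, so the values may differ from ArcGIS output by a scaling factor or angle convention.

```python
import math

def standard_deviational_ellipse(points):
    """Return (mean centre, major semi-axis, minor semi-axis, angle in degrees).

    Covariance-based sketch; the angle is counter-clockwise from the +x axis.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Second moments of the coordinates about the mean centre
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Orientation of the axis with the largest variance
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    # Eigenvalues of the 2x2 covariance matrix give the variances along the axes
    mean_var = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    major = math.sqrt(mean_var + spread)
    minor = math.sqrt(max(mean_var - spread, 0.0))
    return (cx, cy), major, minor, math.degrees(theta)

# Hypothetical points lying on a south-west to north-east line
centre, major, minor, angle = standard_deviational_ellipse(
    [(0, 0), (1, 1), (2, 2), (3, 3)]
)
print(centre, round(major, 3), round(minor, 3), round(angle, 1))
```

A strongly elongated ellipse (major axis much longer than the minor axis) corresponds to the directional trend described above, while nearly equal axes indicate no dominant orientation.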

2.3 Spatial statistical analysis

Nearest neighbour analysis (NNA), Global Moran's I [17], and Local Moran's I [18] were employed to investigate spatial autocorrelation, while Getis-Ord General G and Gi* [19] were used for cluster and hotspot analysis. Statistical significance was defined as a p-value of less than 0.05.
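Each of these statistics is reported with a z-score and a p-value. As a hedged illustration of how the two relate, the sketch below converts a z-score into a two-sided p-value under a standard normal reference distribution. The function name `two_sided_p_value` is hypothetical, and the normal approximation is an assumption of this sketch, not a description of the software used in the study.

```python
import math

def two_sided_p_value(z):
    """Two-sided p-value for a z-score under a standard normal distribution."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi
    return math.erfc(abs(z) / math.sqrt(2))

for z in (1.65, 1.96, 2.58):
    print(z, round(two_sided_p_value(z), 4))
```

Under this approximation, a z-score of about 1.96 in absolute value corresponds to the 0.05 threshold used here.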

NNA calculates the average distance between the centroid of each feature and the centroid of its nearest neighbour, which indicates whether the point pattern is clustered, random, or regular [20]. The observed average distance divided by the expected average distance gives the average nearest neighbour ratio (NNR), which can be calculated as:

$$ANN = \frac{{\overline{D}_{0} }}{{\overline{D}_{E} }}$$
(1)

where \(\overline{D}_{0}\) is the observed mean distance between each feature and its nearest neighbour

$$\overline{D}_{0} = \frac{{\mathop \sum \nolimits_{i = 1}^{n} d_{i} }}{n}$$
(2)

where \(d_{i}\) is the distance between feature \(i\) and its nearest neighbouring feature, \(n\) is the total number of features, and \(\overline{D}_{E}\) is the expected mean distance for the features in a random pattern

$$\overline{D}_{E} = \frac{0.5}{\sqrt{n / A}}$$
(3)

where \(A\) is the area of the minimum enclosing rectangle around all features, or a user-specified area value.

The pattern is clustered when the index (average nearest neighbour ratio) is less than 1. The pattern tends toward a uniform (dispersed) arrangement when the index is greater than 1, and a value close to 1 indicates a random pattern.
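The ratio defined in Eqs. (1) to (3) can be sketched in a few lines of Python. This is an illustrative sketch, not the ArcGIS implementation: the function name `average_nearest_neighbour` and the sample coordinates are hypothetical, distances are planar Euclidean distances, and the z-score uses the standard error 0.26136 / sqrt(n²/A) that is commonly quoted for this test, which is an assumption not stated in the text above.

```python
import math

def average_nearest_neighbour(points, area):
    """Return (nearest neighbour ratio, z-score) for planar points and a study area."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        # Distance from feature i to its closest other feature
        nearest.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    d_obs = sum(nearest) / n              # observed mean distance, Eq. (2)
    d_exp = 0.5 / math.sqrt(n / area)     # expected mean distance, Eq. (3)
    ratio = d_obs / d_exp                 # Eq. (1)
    # Assumed standard error of the mean nearest neighbour distance
    se = 0.26136 / math.sqrt(n * n / area)
    return ratio, (d_obs - d_exp) / se

# Hypothetical example: four points packed into one corner of a 10 x 10 area
tight = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
print(average_nearest_neighbour(tight, area=100.0))
```

A ratio below 1 with a negative z-score points to clustering, and a ratio above 1 with a positive z-score points to dispersion. The result is sensitive to the area value, so the same area definition should be used for every month being compared.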

Spatial autocorrelation, which relates similarity in location to similarity in attribute values, was used to assess the spatial correlation between variables [21]. Global Moran's I, a spatial autocorrelation indicator based on cross products, is expressed as:

$$I = \frac{n}{\sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{ij}} \cdot \frac{\sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{ij} \left( x_{i} - \overline{x} \right)\left( x_{j} - \overline{x} \right)}{\sum_{i = 1}^{n} \left( x_{i} - \overline{x} \right)^{2}}$$
(4)

where \(n\) denotes the number of locations; \(x_{i}\) denotes the CoVID-19 cases at location \(i\); \(\overline{x}\) is the average number of CoVID-19 cases in the study area; and \(w_{ij}\) denotes the elements of the spatial weights matrix \(W\). The significance of the index is usually assessed against a reference statistical distribution [22]. Global Moran's I ranges from −1 to 1, with a positive value indicating that similar values tend to be located near one another (clustering) and a negative value indicating the reverse.
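For readers who want to see the cross-product form in action, the sketch below computes Global Moran's I in its usual form, normalised by the sum of the weights and by the sum of squared deviations from the mean. It is a minimal sketch: the function name `global_morans_i`, the four-area chain, and the case counts are hypothetical and are not taken from the study data.

```python
def global_morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)   # sum of all spatial weights
    # Weighted cross products of deviations between every pair of areas
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n) for j in range(n)
    )
    denom = sum(d * d for d in dev)          # sum of squared deviations
    return (n / s0) * cross / denom

# Hypothetical chain of four areas, each adjacent to the next (binary contiguity)
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(global_morans_i([1, 1, 5, 5], w))   # similar values side by side: positive
print(global_morans_i([1, 5, 1, 5], w))   # alternating values: negative
```

The first call places similar counts next to each other and returns a positive index, while the alternating arrangement returns a negative one, which matches the interpretation of the sign given above.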

2.4 Join count statistics for CoVID risk clustering

The indices used to summarise spatial association patterns among adjacent areas depend on the type of indicator recorded for each area (e.g., continuous or binary) and on how geographical interaction between areas is measured [23]. Join counts are useful for determining whether area i and its neighbouring areas form a high-risk cluster or show some other type of localised risk pattern. Let \(b_{i}\) be a binary risk indicator, with \(b_{i} = 1\) when area i is high risk and \(b_{i} = 0\) otherwise. When both area i and its neighbouring areas are high risk, area i obtains a shared high-risk score:

$$J_{i} = b_{i} \mathop \sum \limits_{j = 1}^{n} w_{ij} b_{j}$$
(5)

The following formula can be used to calculate shared low risk (where both area i and the areas surrounding it are low risk):

$$J_{0i} = \left( {1 - b_{i} } \right)\mathop \sum \limits_{j = 1}^{n} w_{ij} \left( {1 - b_{j} } \right)$$
(6)

When defining localized risk patterns with the emphasis on area i, it may be necessary to distinguish high–low risk pairings (\(b_{i} = 1\), \(b_{j} = 0\)) from low–high risk pairings (\(b_{i} = 0\), \(b_{j} = 1\)) when the risk level differs between paired areas. In these circumstances, the two local join count statistics are:

$$J_{10i} = b_{i} \mathop \sum \limits_{j = 1}^{n} w_{ij} \left( {1 - b_{j} } \right)$$
(7)

$$J_{01i} = \left( {1 - b_{i} } \right)\mathop \sum \limits_{j = 1}^{n} w_{ij} b_{j}$$
(8)

Here \(w_{ij}\) is a binary spatial weight determined by whether areas i and j are contiguous (\(w_{ij} = 1\)) or not (\(w_{ij} = 0\)). Let \(N_{i}\) denote the neighborhood of area i, i.e., the set of areas adjacent to area i (those with \(w_{ij} = 1\)), and suppose that this neighborhood comprises \(L_{i}\) areas (the total number of surrounding areas, i.e., \(L_{i} = J_{i} + J_{0i} + J_{10i} + J_{01i}\)).
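The four local join counts can be illustrated with a short Python sketch. It assumes a binary indicator list `b` (1 for a high-risk area, 0 otherwise) and a binary contiguity matrix `w` with zeros on the diagonal. The function name `local_join_counts`, the dictionary keys, and the four-area example are hypothetical and chosen only to mirror the symbols used above.

```python
def local_join_counts(b, w):
    """Local join counts per area: high-high, low-low, high-low and low-high."""
    n = len(b)
    results = []
    for i in range(n):
        high_nb = sum(w[i][j] * b[j] for j in range(n))        # high-risk neighbours
        low_nb = sum(w[i][j] * (1 - b[j]) for j in range(n))   # low-risk neighbours
        results.append({
            "J": b[i] * high_nb,            # area i high, neighbours high
            "J0": (1 - b[i]) * low_nb,      # area i low, neighbours low
            "J10": b[i] * low_nb,           # area i high, neighbours low
            "J01": (1 - b[i]) * high_nb,    # area i low, neighbours high
        })
    return results

# Hypothetical chain of four areas; the first two are flagged as high risk
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
for area, counts in enumerate(local_join_counts([1, 1, 0, 0], w)):
    print(area, counts, "neighbours:", sum(counts.values()))
```

For every area the four counts add up to its number of neighbours, which is the identity for \(L_{i}\) stated above. At most two of the four counts can be non-zero for a given area, because \(b_{i}\) is either 0 or 1.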

3 Results

3.1 Distribution pattern of containment points

Figure 4 shows the directional distribution of the CoVID containment zones as a monthly series (from May 2020 to October 2020). The ellipse is often oriented in a west-northeast direction, indicating that for May and June 2020, the majority of cases are located in the western blocks of North 24 Parganas (Fig. 4a, b). The SDE turns eastward in August 2020 and becomes elongated (Fig. 4d), indicating that cases spread from the centre towards the periphery. The situation remained the same until September. For October 2020, the SDE shrinks and develops a short axis from southwest to northeast, showing the declining spread of containment zones.

Fig. 4

Directional distribution of monthly covid containments during the period between (a). May 2020; (b) June 2020; (c) July 2020; (d) August 2020; (e) September 2020; to (f) October 2020

Table 1 compares the z-scores and p-values of the NNA for North 24 Parganas across the study months. For May, June, July, August, and September, the p-values are less than 0.005, which demonstrates that the NNA result is significant at the 1% level and indicates that the CoVID-19 distribution is clustered at the 99% confidence level. For October, the p-value is greater than 0.005 and is not significant at the 1% level, indicating a dispersed pattern of CoVID-19 distribution.

Table 1 Summary of nearest neighbour ratio (NNR) of CoVID-19 distribution in North 24 Parganas

3.2 Spatial autocorrelation of containment distribution

The areal distribution of CoVID-19 within the administrative boundary was studied using the Global Moran's I statistic. Table 2 summarises the spatial autocorrelation results for the monthly CoVID containment locations obtained with the Moran's I and Getis-Ord Gi* statistics, together with the p-value for each month. The Global Moran's I test gave statistically significant results (z-scores above 1.96), which suggests spatial heterogeneity. For the months of May, June, July, and August, the null hypothesis of the Global Moran's I statistic, which states that the examined attribute is distributed randomly among the blocks, is rejected (p-value less than 0.05). In May and July, the p-value is greater than 0.005, showing that there is less than a 5% likelihood that the clustered pattern is the result of random chance. In June, the p-value is less than 0.005, showing that there is less than a 1% likelihood that the clustered pattern is the result of random chance. In August and October, the pattern is not statistically significant. In September, there is less than a 10% likelihood that the clustered pattern is the result of random chance.

Table 2 Summary of Global Moran's I and Getis-Ord Gi statistics of CoVID-19 distribution in North 24 Parganas

The Getis-Ord Gi* tool assesses each CoVID-affected block and contrasts the local situation in the neighbouring blocks with the global situation. Table 2 shows the Getis-Ord Gi* statistics with their z-scores and p-values. According to these findings, there was positive spatial autocorrelation. For the months of May, June, July, and September, all z-score values were statistically significant at the 0.05 level, so the null hypothesis can be rejected. The spatial distribution of high and/or low values of CoVID in the data set was more spatially clustered than would be expected if the underlying spatial processes were random. Figure 5 depicts the spatial clustering of the CoVID-19 containment zones in North 24 Parganas according to the join count statistics. High-high CoVID clusters were noted over Barrackpore II and Rajarhat in May and June, whereas low–high outliers developed over Amdanga and Barasat. In July, Barrackpore I developed a high-high cluster. During the research period, the neighboring blocks of Amdanga and Barasat I persisted as low–high outliers for CoVID cases. Minakhan, Basirhat II, Sandeshkhali I, Sandeshkhali II, Hingalganj, and Hasnabad, the south-eastern blocks of North 24 Parganas, remained low-low clusters throughout the research period (Fig. 5a–f).

Fig. 5

Covid clusters in North 24 Parganas during the period between (a). May 2020; (b). June 2020; (c). July 2020; (d). August 2020; (e). September 2020; to (f). October 2020

3.3 Covid risk clustering and hotspot in North 24 Parganas

The Hot Spot Analysis tool calculates the Getis-Ord Gi* statistic for each feature in a dataset. The resulting z-scores and p-values show where features with either high or low values cluster spatially. Each feature is examined in the context of its neighbors. A feature with a high value is not necessarily a statistically significant hot spot: to be one, it must have a high value and also be surrounded by other features with high values. The local sum for a feature and its neighbors is compared proportionally to the sum of all features. When the local sum differs from the expected local sum by more than can be attributed to random chance, a statistically significant z-score results. The Gi* statistic returned for each feature is a z-score. For statistically significant positive z-scores, the larger the z-score, the more intense the clustering of high values (hot spot). For statistically significant negative z-scores, the smaller the z-score, the more intense the clustering of low values (cold spot). The statistically significant hotspots of CoVID occurrences are shown in Fig. 6a–f. During the study period, Barrackpore I, Barrackpore II, Amdanga, Barasat I, and Barasat II formed significant CoVID hotspots and were estimated to be extremely vulnerable blocks that require strategic planning.
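The comparison of a local sum with its expectation can be sketched as follows. This is a minimal sketch of the commonly published z-score form of Gi*, not the internals of the Hot Spot Analysis tool: the function name `getis_ord_gi_star`, the binary contiguity weights, and the five-block example values are hypothetical, and each feature is treated as its own neighbour.

```python
import math

def getis_ord_gi_star(values, weights):
    """Gi* z-score for each feature, given values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        w = list(weights[i])
        w[i] = 1.0                      # Gi* includes the feature itself
        sw = sum(w)
        sw2 = sum(x * x for x in w)
        local_sum = sum(wj * xj for wj, xj in zip(w, values))
        # Difference between the local sum and its expected value
        numerator = local_sum - mean * sw
        denominator = s * math.sqrt((n * sw2 - sw ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Hypothetical chain of five blocks with high counts at one end
w = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
print([round(z, 2) for z in getis_ord_gi_star([10, 9, 1, 1, 1], w)])
```

Blocks at the high-count end of the chain receive positive z-scores and those at the low-count end receive negative ones. The sketch does not guard against degenerate cases, such as all values being equal or a feature whose neighbourhood covers every feature, where the denominator is zero.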

Fig. 6

Covid Hotspot in North 24 Parganas during the period between (a). May 2020; (b). June 2020; (c). July 2020; (d). August 2020; (e). September 2020; to (f). October 2020

4 Discussion

In this study, micro-level containment zones are used to map CoVID-19 hotspots from May to October 2020 in the North 24 Parganas district of West Bengal. The study also evaluates the spatiotemporal spread of CoVID-19. According to the results, the CoVID-19 containment zones show a higher density and considerable clustering along the western border of North 24 Parganas. A significant cluster of CoVID-19 hotspots was found on the western edge of the district during the peak period of CoVID-19 case records. The blocks of Barrackpore I and II, Rajarhat, Barasat I and II, Basirhat II, and Habra showed a significant clustering tendency, as indicated by the Global Moran's I value. With less than a 5% likelihood that this clustered pattern could be the result of random chance, such a highly concentrated spatial pattern of containment zones may be attributed to urbanisation and the higher population density of these blocks.

The rural–urban distribution of the population plays a crucial role in the spatio-temporal distribution of CoVID-19 [15]. Owing to its greater geographic and economic proximity to Kolkata City, the western boundary of North 24 Parganas records a faster rate of urbanisation and the maximum population density, and as a result a higher concentration of CoVID-19 cases. According to Mishra, Gayen, and Haque, urbanisation results in larger population densities, where those who use outdoor toilets and drink contaminated water are more vulnerable to pandemic disease [24]. In contrast, the low-density rural block of Hingalganj shows no containment zone from May to October 2020. A few authors also point out that socio-demographic variables, including resource distribution and proximity to urban areas, were key explanatory variables in determining the occurrence of CoVID-19 [25]. Our findings are similar. The eastern blocks of North 24 Parganas, which are predominantly rural and lie away from the major urban agglomerations, record the minimum number of containment locations from May to October 2020.

In North 24 Parganas, the spatial distribution of CoVID-19 cases is quite unequal. Given the environmental and population characteristics that may control the CoVID-19 distribution, the risk of spatial spread of the disease differs between the blocks of the district. The population density is relatively high here, which is the main source of concern, as there is a significant chance of CoVID-19 infections [26]. A detailed study is required to understand micro-level differences within the district and to analyze the influence of rural–urban pattern, space, and morphodynamic properties on coronavirus infection. To decode the factors influencing the geographic spread of infectious illnesses, further studies need to incorporate socioeconomic, behavioral, environmental, topographic, and demographic factors as explanatory variables [27]. A group of scholars in Bangladesh conducted district-level studies and used demographic, economic, meteorological, built environment, health, and facilities-related parameters to identify probable variables affecting CoVID-19 incidence rates [28]. In the context of spatial regression modelling (SRM), they used three global models, namely ordinary least squares (OLS), the spatial lag model (SLM), and the spatial error model (SEM), and one local model, geographically weighted regression (GWR). A similar approach may be adopted to explain the block-wise clustering of CoVID-19 in North 24 Parganas. Combining environmental and socio-economic risk factors such as age, gender, language, caste, religion, occupation, infrastructure, and urban status with CoVID case and death data may further explain the pandemic's geographical determinants.

5 Conclusion

The study showed the patterns of geographical dispersion and spread of the CoVID-19 pandemic in North 24 Parganas. GIS-based spatial methods have provided a lens through which to evaluate the extent of the pandemic's spread. We observe that the containment clusters of 2020 again recorded a higher density of CoVID cases during 2022 (Fig. 7). This appears to support the investigation of the CoVID-19 distribution and its important geographic link. By analyzing such geographical distribution patterns, the government may be able to monitor and predict the spread of the virus in both local areas and large parts of the state. The results of this study could help the associated organizations as they carry out a thorough analysis of the virus's spread and of environmental management in the research area. Micro-level analysis is an integral part of facilitating regional cooperation to support recovery plans by establishing consistent safety and mitigation standards, pooling resources, and enhancing investment opportunities. The varied geographical impact necessitates territorialized policy responses on fiscal, social, economic, and health grounds, as well as robust intergovernmental collaboration. This study aims to provide support at the national and subnational levels for vulnerable populations, to prevent the situation from getting worse and to promote inclusivity during the recovery phase.

Fig. 7

Containment locations of North 24 Parganas notified on February 1, 2022 and Covid hotspot of 2020