1 Introduction

Coronavirus disease 2019 (COVID-19) is an infectious disease that originated in Wuhan city, Hubei province, China (Lescure et al. 2020; Huang et al. 2020). The disease was first identified in 2019 and is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Lin et al. 2020; Cheng and Shan 2020). It spread rapidly across the world and was declared a pandemic by the World Health Organization (WHO 2020a, b, c). More than 213 countries and territories have reported 14,599,613 confirmed COVID-19 cases with 607,746 deaths; after China, the outbreak shifted to Europe and other countries (Johns Hopkins University, 2020). Thereafter the virus was transmitted to the USA and Latin American countries. In terms of the number of patients diagnosed with the coronavirus, the USA is at the top of the list, followed by Brazil (WHO). Currently, the highest numbers of infections and deaths are in the USA. Millions of people have been infected across the world, and many thousands have already died (Worldometers 2020). COVID-19 positive cases in countries outside China have been identified in people with no travel history to China, indicating that human-to-human transmission was taking place in these countries (Rothe et al. 2020). Apart from its direct health effects, this pandemic will affect almost every human being on Earth in one way or another (Arora and Mishra 2020). The pandemic has triggered significant global socioeconomic disruption, including the postponement and cancellation of sporting, educational, political and cultural events (New York Times 2020a, b). The first confirmed case of COVID-19 in India was registered in Thrissur district of Kerala state in south India on 30 January 2020, when a student arrived from China (India Today 2020). Thereafter, the disease was transmitted to most of the states of India (Kaushik et al. 2020). 
As of July 20, 2020, India had reported 1,118,043 confirmed cases and 28,099 deaths from coronavirus infection in 31 states and union territories (COVID19 India; MoHFW 2020a, b). India is a vast country with varying climates, cultures, languages, and employment and educational opportunities, which produce distinct and complex patterns of interconnection between its regions. With 121 crore people, it has the second largest population in the world after China (Census of India 2011). On 22 March 2020, as cases rose, the Prime Minister of India declared a nationwide 21-day complete lockdown to reduce the effects of the new coronavirus (New York Times 2020a, b). To minimize the effect and spread of the novel coronavirus, the lockdown was enforced across the country with rigorous restrictions on transport, human mobility and economic activity. Since no pharmaceutical treatments or vaccines against the coronavirus are yet available, a high transmission rate is expected, which will pose major challenges in controlling the spread of the disease (Khosravi 2020).

Population density is important because it directly affects the patterns of communication and the level of interaction among individuals in a population (Hu et al. 2013). Theoretically, higher population density leads to closer contact and greater interaction between people, which makes dense areas potential hotspots for a high prevalence of an emerging disease such as COVID-19 (Hamidi et al. 2020). Empirical evidence supports the view that densely populated areas facilitate the transmission of contagious diseases (Glaeser 2011). A positive and significant relationship has also been found between district-level population density and infectious diseases (Kao et al. 2012; Garrett, 2010). Several studies have shown that population density has a large impact on contagious disease through human contact (Chandra et al. 2013; Rocklöv and Sjödin 2020; Mishra et al. 2020; Jia et al. 2020; Alaniz et al. 2017). However, some researchers found no significant association between population density and degree of virulence (Chowell et al. 2008; Mills et al. 2004). Thus, the role of population density in the transmission and severity of highly infectious diseases remains complex and elusive. Our study intends to address this gap by examining the relationship between density and the spread of COVID-19 in the south Indian states.

India is one of the most densely populated countries in the world. As per the Census of India (2011), the population density of India is 382 persons per square kilometre. The south Indian states vary widely in terms of population density. Among these five states, Kerala has the highest population density (Census of India 2011). This variation in population density across the five states may affect the course of the COVID-19 pandemic differently in each. Demographically, Kerala, Karnataka, Tamilnadu, Telangana and Andhra Pradesh have emerged as an exception over the last few decades (Census of India 2001 and 2011). These states have lower birth rates and death rates than other states of India (Liu et al. 2019). Population density, one of the most fundamental demographic attributes, can vary systematically with the scale of spatial aggregation. According to the 2011 census, the population density in Kerala is 860 persons per sq. km, much higher than the all-India level of 382 persons per sq. km. Next to Kerala is Tamilnadu, where every square kilometre is inhabited by 555 people (Census of India 2011). The remaining three states, Karnataka, Telangana and Andhra Pradesh, have population densities of 319, 312 and 304 persons per sq. km, respectively (Census of India 2001; Statistical Year Book 2017; Districts at a Glance, Andhra Pradesh 2018). The number of COVID-19 cases in Tamilnadu reached 175,678 as total infections continued their upward trend until 20th July 2020 (Tamilnadu COVID 19 Public Dashboard). Just over 4 months after its first COVID-19 case, Karnataka state has reported 69,614 corona positive cases (Karnataka COVID 19 Dashboard). In Telangana state, the surge in COVID-19 cases continued, with the total number of people affected reaching 46,274 (Telangana Fights Corona). Coronavirus is spreading very rapidly in Andhra Pradesh despite the lockdown, with a total of 53,724 positive cases (COVID-19 Andhra Pradesh Website). 
Among all the south Indian states, Kerala has reported the fewest positive cases, with 13,279 infections in 14 districts (Official Kerala COVID-19 Statistics Website). Kerala and Tamilnadu have high-quality public health infrastructure compared with other states of India (NITI Aayog, 2019). However, the number of cases is increasing, as shown in Fig. 1, which is an alarming situation for these five states (COVID-19 India).

Fig. 1

Data source: Ministry of Health & Family Welfare, GOI and prepared by authors

Trend of novel coronavirus (COVID-19) positive cases in five south Indian states from 26 March to 24 July 2020.

Our study investigates the association between COVID-19 contagion and population density in five south Indian states (Kerala, Karnataka, Tamilnadu, Telangana and Andhra Pradesh), which have developed healthcare systems to fight COVID-19. In the context of COVID-19, statistical spatial models have been used to examine the widespread infection rate in India. Model studies have also been used to assess the risk of COVID-19 spread in different regions (Bherwani et al. 2020a, b; Das et al. 2020; Gupta et al. 2020). However, correlation analysis of the different variables in the COVID-19 data series is also an important aspect. In this direction, this paper proposes a novel approach that combines the Pearson correlation coefficient, response surface methodology (RSM) and the Thiessen Polygon (TM) method. The Pearson product-moment correlation measures the strength of the association between population density and COVID-19 positive cases. To evaluate the significance of the results, we used the RSM process, in which these two variables were found to affect the response of interest (Said et al. 2014; Shirazi et al. 2020). To present the data analysis as maps, Thiessen polygons (TM), also known as Voronoi polygons, were constructed for better understanding (Schumann, 1998). The novelty of the present study lies in the mathematical and statistical analysis of the relationship between infected persons and a demographic factor, which can help explain the transmission dynamics of COVID-19 in different districts of southern India. To the best of our knowledge, this is the first study in the world to use the RSM and TM methods to address the impact of population density during COVID-19.

With COVID-19 spreading, one of the most pressing issues is to understand how population density has shaped the pattern of confirmed COVID-19 cases in the south Indian states. This study contributes to ongoing COVID-19 monitoring efforts by relating reported cases to population density at the district level. Spatial analysis was carried out in a GIS environment to evaluate the distribution of COVID-19 positive cases and the population density of districts. The analysis can be repeated daily as new data become available, to detect newly emerging zones and to focus on areas where transmission is decreasing or increasing.

2 Materials and methods

2.1 Data used

In this study, we attempt to correlate population density with the number of COVID-19 positive cases in different districts of south India. The research is based on secondary sources of information. Data on daily infected cases were collected from the COVID-19 India portal (COVID19 India) and the official website of the Ministry of Health and Family Welfare as of 20th July 2020 (MoHFW 2020b). Population density data for Kerala, Karnataka and Tamilnadu were collected from the Census of India 2011. For the remaining two states, Telangana and Andhra Pradesh, population density data were retrieved from the Statistical Year Book 2017 (Directorate of Economics and Statistics, Telangana) and Districts at a Glance, Andhra Pradesh 2018 (Directorate of Economics and Statistics, Andhra Pradesh), respectively. The district was taken as the unit of analysis. Fourteen districts from Kerala, thirty-eight districts from Tamilnadu, thirty districts from Karnataka, thirty-three districts from Telangana and thirteen districts from Andhra Pradesh were included in the analysis.

2.2 Methodology

Statistical analysis was conducted to determine the correlation between population density and COVID-19 positive cases using the Minitab statistical software (version 19.2.0), and the district-wise maps of COVID-19 infections and population density were produced in a GIS environment using ArcGIS software (ESRI 2017). The inter-district variation is grouped into either five categories (very low, low, medium, high, very high) or three categories (low, medium, high). The Pearson correlation coefficient, response surface methodology (RSM) and the Thiessen polygon method were applied to examine the association between the number of COVID-19 infected cases and population density (Fig. 2). The following sections describe the methods applied in this research.

Fig. 2

Conceptual model of correlation research between COVID-19 and population density

2.2.1 Correlation coefficient

Correlation is a statistical method that determines the degree of relationship between two variables. A correlation coefficient is a numerical measure of that relationship. One of its most important features is that it can only range from −1.00 to +1.00. The data were therefore normalized in the Minitab statistical software. If the correlation coefficient approaches r = +1.00 (or is greater than r = +0.50), it indicates a strong positive relationship between the two variables. If the correlation coefficient approaches r = −1.00 (or is less than r = −0.50), it indicates a strong negative relationship. If the correlation coefficient approaches r = 0.00, it indicates that there is no linear correlation between the two variables. After normalization of all data sets, a correlation study was carried out using the ‘Pearson Product-Moment Correlation Coefficient’ to measure the degree of relationship between two continuous variables (Kowalski 1972). The Pearson coefficient r was obtained using the following formula in R statistical software:

$$r = \frac{n\sum XY - \sum X \cdot \sum Y}{{\sqrt {n\sum X^{2} - (\sum X)^{2} } \cdot \sqrt {n\sum Y^{2} - (\sum Y)^{2} } }}$$
(1)

where n = number of pairs of scores; \(\sum X\) = sum of x scores; \(\sum Y\) = sum of y scores; \(\sum XY\) = sum of the products of paired scores; \(\sum X^{2}\) = sum of squared x scores; \(\sum Y^{2}\) = sum of squared y scores.
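As an illustration of Eq. (1), the raw-sums form can be sketched in a few lines of Python. This is a minimal sketch rather than the Minitab or R routine used in the study; the function name `pearson_r` and the sample values are hypothetical and are not drawn from the study's data set.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation using the raw-sums form of Eq. (1)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)                                   # number of pairs of scores
    sum_x = sum(xs)                               # sum of x scores
    sum_y = sum(ys)                               # sum of y scores
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of products of paired scores
    sum_x2 = sum(x * x for x in xs)               # sum of squared x scores
    sum_y2 = sum(y * y for y in ys)               # sum of squared y scores
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return numerator / denominator

# Hypothetical district values (illustrative only, not the study's data)
density = [100.0, 200.0, 300.0, 400.0, 500.0]
cases = [12.0, 25.0, 31.0, 48.0, 55.0]
r = pearson_r(density, cases)
print(f"r = {r:.3f}")
```

The raw-sums form mirrors the equation term by term, which makes it easy to check by hand; for very large values a mean-centred form is numerically more stable.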

After computing the r value for a given data set, a two-tailed test was carried out to check the significance of the correlation coefficient. The value of r and the number of pairs of scores were converted through a formula into a statistic that follows the ‘t’ distribution. The ‘t’ formula was used to test the null hypothesis that the correlation is equal to zero. The formula used for the ‘t’ test was:

$$t = \frac{r}{\sqrt{(1 - r^{2})/(n - 2)}}$$

where r = correlation coefficient and n = the number of pairs of scores.

Degrees of freedom describe the number of independent values that a statistical analysis can estimate. Here they are given by n − 2, where n is the number of pairs of scores.
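The significance test can be sketched as follows, using the standard statistic t = r·√(n − 2)/√(1 − r²) with n − 2 degrees of freedom. The function name `t_statistic` and the example inputs (r = 0.62, n = 14) are illustrative assumptions; looking up the two-tailed P value in the t distribution is left out because it requires a statistics library.

```python
import math

def t_statistic(r, n):
    """Return (t, df) for testing whether a correlation r from n pairs differs from zero."""
    if n < 3:
        raise ValueError("need at least three pairs of scores")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    df = n - 2                                    # degrees of freedom
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

# Illustrative inputs: a moderate correlation from 14 pairs of scores
t, df = t_statistic(0.62, 14)
print(f"t = {t:.3f}, df = {df}")
```

The computed t is then compared with the two-tailed critical value of the t distribution for the given degrees of freedom.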

2.2.2 Response surface method

Response surface methodology (RSM) can be described as a collection of statistical and mathematical techniques that are useful for developing, improving and optimizing processes (Myers et al. 2016). Response surface design is a set of advanced design of experiments (DOE) techniques that help to better understand and optimize the response. RSM is used extensively in situations where several input factors potentially influence the performance of a process (Carley et al. 2004). RSM includes several methods for designing experimental procedures, and central composite design (CCD) is one of them. When carried out with optimization, CCD allows the parameters to be screened over a broad range and the role of each factor to be assessed (Sahin et al. 2011). In this study CCD was applied with two variables, which were coded and carried into the experimental phase according to the following formula:

$$x_{i} = \left( {X_{i} - \overline{{X_{i} }} } \right)/\Delta X_{i}$$
(2)

where \(x_{i}\) refers to the coded level, \(X_{i}\) refers to the natural level of the independent variable, \(\overline{{X_{i} }}\) refers to the mean of the natural levels of the independent variable and \(\Delta X_{i}\) refers to the step change value. All responses obtained after the experimental procedures were evaluated using regression techniques, for example the second-order polynomial model presented in the following equation:

$$y = \beta_{0} + \sum_{i = 1}^{k} \beta_{i} X_{i} + \sum_{i = 1}^{k - 1} \sum_{j = i + 1}^{k} \beta_{ij} X_{i} X_{j} + \sum_{i = 1}^{k} \beta_{ii} X_{i}^{2}$$
(3)

where \(y\) refers to the response; \(k\) is the number of independent variables (two in this study); and \(\beta_{0}\), \(\beta_{i}\), \(\beta_{ii}\) and \(\beta_{ij}\) refer to the regression coefficients of the process variables for the intercept, linear, quadratic and cross-product terms, respectively (Prakash Maran et al. 2017). Analysis of variance (ANOVA) was carried out to test the fit of the model using the coefficient of determination (R-square) and the F-test for lack of fit. The effects of the factors and their interactions were assessed by interpreting the P values from the ANOVA.
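For two factors, the coding step of Eq. (2) and the second-order model can be sketched as below. This is an illustrative sketch only: the function names, the coefficient keys (`b0`, `b1`, `b2`, `b12`, `b11`, `b22`) and all numeric values are hypothetical, and the coefficients are not estimates from this study.

```python
def coded_level(natural, center, step):
    """Eq. (2): convert a natural-scale value to its coded level."""
    if step == 0:
        raise ValueError("step change must be non-zero")
    return (natural - center) / step

def second_order_response(x1, x2, beta):
    """Second-order polynomial for two factors.

    beta holds the intercept (b0), linear (b1, b2), cross-product (b12)
    and quadratic (b11, b22) coefficients; the key names are hypothetical.
    """
    return (beta["b0"]
            + beta["b1"] * x1 + beta["b2"] * x2
            + beta["b12"] * x1 * x2
            + beta["b11"] * x1 ** 2 + beta["b22"] * x2 ** 2)

# Hypothetical centre points and step sizes for the two factors
x1 = coded_level(700.0, center=500.0, step=200.0)    # coded value +1.0
x2 = coded_level(2.0e6, center=3.0e6, step=1.0e6)    # coded value -1.0

# Hypothetical coefficients, chosen only to make the arithmetic easy to follow
beta = {"b0": 10.0, "b1": 2.0, "b2": 3.0, "b12": 0.5, "b11": 1.0, "b22": -1.0}
y = second_order_response(x1, x2, beta)
print(x1, x2, y)
```

With two factors the model has six coefficients in total: one intercept, two linear terms, one cross-product term and two quadratic terms.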

2.2.3 Thiessen polygon or Voronoi polygon

The Thiessen polygon interpolation method, proposed by the American meteorologist A. H. Thiessen, is often referred to as the nearest neighbor interpolation method (Peng and Li 2008). The Thiessen polygon is a simple and direct framework in which every location is assigned the value of its nearest sample point (Thiessen, 1911). To examine the correlation between COVID-19 positive cases and population density, Thiessen polygons were constructed and analysed for each state to better understand the influence among regions. The area of a Thiessen polygon is determined by the locations of the points. Each polygon contains exactly one feature point, and its defining property is that any location within the polygon is closer to that polygon’s point than to the point of any other polygon. Because Thiessen polygons are created from a set of feature points, every individual polygon delineates an area of influence around its input feature point (Wheeler et al. 2010). The Thiessen polygon is therefore a basic method for analysing neighborhood and proximity (Bherwani et al. 2020a, b). In this study, Thiessen polygons were prepared separately for population density and COVID-19 positive cases for each state in order to understand the significance of the correlation. The polygons were created in ArcGIS using dedicated tools, with the feature points and their corresponding locations as input.
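The defining property of a Thiessen polygon, namely that every location belongs to the polygon of its nearest feature point, can be sketched without GIS software. The sketch below is not the ArcGIS workflow used in the study; it assumes planar (projected) coordinates, and the point names and coordinates are hypothetical.

```python
import math

def nearest_feature(location, feature_points):
    """Return the name of the feature point whose Thiessen polygon contains location.

    feature_points maps a name to an (x, y) tuple in planar coordinates.
    """
    if not feature_points:
        raise ValueError("at least one feature point is required")
    return min(feature_points,
               key=lambda name: math.dist(location, feature_points[name]))

# Hypothetical district centroids in an arbitrary planar coordinate system
centroids = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}

for query in [(2.0, 1.0), (8.0, 3.0), (1.0, 9.0)]:
    print(query, "->", nearest_feature(query, centroids))
```

Assigning every cell of a fine grid in this way and merging cells with the same label reproduces the polygons; dedicated GIS tools construct the same partition geometrically.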

2.3 Why South Indian states?

Health inequalities exist in India because health services are unequally distributed across the states. India’s healthcare infrastructure is unable to cope with the present COVID-19 crisis. The major problems are shortages of medical supplies and a failure to provide adequate testing. This study examined health infrastructure in the five south Indian states, focusing on hospital beds. India has 1,899,228 hospital beds, or 1.13 beds per thousand persons, including both government and private hospitals. As can be seen in Fig. 3, the number of beds per thousand is very low in Andhra Pradesh and Telangana. According to the latest Health Index report, Kerala has emerged as the state with the best healthcare infrastructure in India (National Health Profile 2019). Tamilnadu and Karnataka have a better capacity than the national average, whereas Andhra Pradesh and Telangana have a poor capacity of hospital beds per thousand persons. Overall, southern India has a better capacity to serve patients, but with the current level of resources it is still insufficient to fight this pandemic.

Fig. 3

Source: Central Bureau of Health Intelligence, Ministry of Health & Family Welfare, GOI 2019 and Tabulated by authors

Beds available per thousand persons in the south Indian states.

Kerala is located at the southwestern end of the Indian landmass. It is bordered to the south and east by Tamil Nadu and to the north and northeast by Karnataka. Its tropical climate and rich monsoons offer beautiful landscapes, ample bodies of water, long beaches and over forty rivers. Popularly known as “God’s Own Country,” Kerala lies between 8° 17′ and 12° 47′ North latitude and 74° 27′ and 77° 37′ East longitude. The population of Kerala is 33,406,061 according to the Provisional Population Figures 2011 (Census of India, 2011). Kerala is home to nearly 3 percent of India’s population and is about three times as densely populated as the rest of India.

Tamil Nadu is India’s 11th largest state in terms of area and has the country’s third largest urban population. The state, with a population of 72,138,958 (Census 2011), is the seventh most populous in the country and covers an area of 130,058 sq. km. It is bounded by Andhra Pradesh and Karnataka on the north, Kerala on the west, the Bay of Bengal on the east and the Indian Ocean on the south (Fig. 4). Tamilnadu extends between 8° 5′ and 13° 35′ North latitude and 76° 15′ and 80° 20′ East longitude (Census of India 2011).

Fig. 4

Location map of the five south Indian states

Karnataka is located in the southwestern part of India. It shares its boundaries with Kerala to the southwest, Goa to the northwest, Tamilnadu to the southeast and Andhra Pradesh to the east. In terms of population, Karnataka ranks ninth in India. According to Census 2011, Karnataka has a population of 6.11 crore. The total geographical area of the state is 191,791 sq. km. Karnataka lies between 11° 30′ and 18° 30′ North latitude and 74° and 78° 30′ East longitude (Census of India, 2011).

Telangana state was carved out of the earlier Andhra Pradesh and established in the Indian Union as the twenty-ninth state, with a geographic area of 112,077 sq. km. Telangana lies on the Deccan plateau in the southern region of India. The state is bordered by Maharashtra to the north, Chhattisgarh to the northeast, Karnataka to the west and Andhra Pradesh to the south and east. The total population of the state is about 3.50 crore as per the Census 2011. Telangana stretches from 16° 30′ to 18° 20′ North latitude and from 77° 30′ to 79° 30′ East longitude (Statistical Year Book 2017).

Andhra Pradesh state is located in the southeastern part of the Indian subcontinent (Fig. 4). It is bounded to the south by Tamil Nadu, to the southwest and west by Karnataka, to the northwest by Telangana and to the northeast by Odisha. The state includes a substantial portion of the Eastern Ghats and the eastern part of the Deccan plateau. According to Census 2011, the total population of the state is 49,386,799 and its geographical area is 162,975 sq. km. Andhra Pradesh lies between latitudes 12° 41′ and 19° 07′ N and longitudes 77° and 84° 40′ E (Districts at a Glance, Andhra Pradesh 2018).
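Population density is simply population divided by area, so the densities quoted in the Introduction can be cross-checked against the population and area figures given in this section. The short sketch below does this for Tamil Nadu, Karnataka and Telangana as examples; the function name is hypothetical, and the crore-based populations are the rounded values stated in the text.

```python
CRORE = 10_000_000  # 1 crore = 10 million

def population_density(population, area_sq_km):
    """Persons per square kilometre."""
    if area_sq_km <= 0:
        raise ValueError("area must be positive")
    return population / area_sq_km

# (population, area in sq. km) as stated in the state descriptions
states = {
    "Tamil Nadu": (72_138_958, 130_058),
    "Karnataka": (6.11 * CRORE, 191_791),
    "Telangana": (3.50 * CRORE, 112_077),
}

densities = {name: round(population_density(pop, area))
             for name, (pop, area) in states.items()}
print(densities)  # {'Tamil Nadu': 555, 'Karnataka': 319, 'Telangana': 312}
```

The rounded results agree with the densities of 555, 319 and 312 persons per sq. km reported for these three states.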

3 Results and discussion

The recent COVID-19 infections have triggered major health burdens in many states across India (Gautam and Hens 2020). COVID-19 positive cases in India stood at 1,118,043 as of 20th July 2020. Tamilnadu has the highest number of corona positive cases among the south Indian states, followed by Karnataka and Andhra Pradesh. Among all five south Indian states, Kerala has the lowest number of positive cases and the highest recovery rate from coronavirus infection. Moreover, the mortality rate in the south Indian states is very low in comparison with the country as a whole. South India as a whole has managed the pandemic more effectively than the other Indian states. Cases have risen sharply in Tamil Nadu, but mortality has remained relatively low. In Telangana, only 422 persons have lost their lives, and Tamilnadu, another strong performer against COVID-19 with 175,678 reported cases, is witnessing a fatality rate of 1.45 percent, which is very low in comparison with India as a whole. Our main aim is to compare district-wise COVID-19 cases with population density in all five states. Table 1 depicts the overall scenario of the coronavirus-affected south Indian states in comparison with India.
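The fatality percentages used in this section are case fatality rates, that is, deaths divided by confirmed cases and expressed as a percentage. The sketch below recomputes two of them from counts stated in the text (India: 28,099 deaths among 1,118,043 confirmed cases; Telangana: 422 deaths among 46,274 cases); the function name is a hypothetical choice.

```python
def case_fatality_rate(deaths, confirmed):
    """Case fatality rate in percent: 100 * deaths / confirmed cases."""
    if confirmed <= 0:
        raise ValueError("confirmed cases must be positive")
    return 100.0 * deaths / confirmed

india_cfr = case_fatality_rate(28_099, 1_118_043)
telangana_cfr = case_fatality_rate(422, 46_274)
print(f"India: {india_cfr:.2f}%  Telangana: {telangana_cfr:.2f}%")
# prints "India: 2.51%  Telangana: 0.91%"
```

Both values match the national average of 2.51 percent and the Telangana figure of 0.91 percent quoted in the state-level discussion.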

Table 1 State-wise breakdown of total infected, recovered and deceased cases of COVID-19 infection

3.1 Correlation between population density and COVID-19 positive cases

A Pearson correlation analysis was conducted to explore the relationship between population density and COVID-19 positive cases in the south Indian states. The guidelines for interpreting the correlation coefficient (Ratner, 2009) are provided in Table 2. As shown in Table 3, there was a strong positive linear association between population density and COVID-19 positive cases in Tamilnadu [r (36) = 0.98], Telangana [r (31) = 0.99] and Karnataka [r (28) = 0.98]. A moderate positive correlation between population density and COVID-19 positive cases was observed in Kerala [r (12) = 0.62]. For Andhra Pradesh, the association between these two variables was positive but weak [r (11) = 0.11]. A positive correlation indicates that an increase in one variable is accompanied by an increase in the other. The findings of this study show that the number of confirmed cases was typically highest in areas with high density, followed by areas with medium to low density. Overall, the correlation analysis shows that the strength of the relationship between density and cases differs among the south Indian states. Therefore, a response surface method analysis was further applied to better conceptualize the dynamics that can control the evolution of COVID-19 in these states.

Table 2 The guidelines for interpreting the correlation coefficient
Table 3 Tracking the association between population density and COVID-19 cases using the Pearson correlation coefficient for south Indian states

3.2 Spatial analysis between population density and COVID-19 contagion using the Thiessen polygon method

We conducted a spatial analysis across the districts of the five south Indian states. Tamilnadu, Telangana, Karnataka and Kerala show a clear positive relationship between population density and COVID-19 positive cases, whereas in Andhra Pradesh the two indicators are only weakly positively correlated.

3.2.1 Kerala state

Unlike other governments, Kerala, a small coastal state in southern India, continues to curb the spread of COVID-19 through its effective decentralized healthcare system funded by local self-government. As of July 20, 2020, Kerala, a state with high population density, had recorded 13,279 confirmed COVID-19 cases. Most of the coronavirus patients are found in the central and southern parts of Kerala. Malappuram, Thrissur, Ernakulam and Alappuzha districts have the highest infection rates. The state did a remarkable job in that 95% of those infected had completely recovered, with a case fatality rate of 0.78%, by 10th May 2020. After that, coronavirus infections started to increase in different parts of the state due to international migration (Official Kerala COVID-19 Statistics Website). However, the death rate remains very low compared with the national average of 2.51%. To understand the influence of population density, we applied the Thiessen polygon approach to the COVID-19 data. Central Kerala has emerged as a potential hotspot in the state. COVID-19 positive cases were also related to the population density of the different districts. The Kerala government COVID-19 portal indicates that international migration contributed to the positive cases. The Thiessen polygon map in Fig. 5 clearly depicts the district-wise variation in density and coronavirus cases.

Fig. 5

Thiessen polygon map of the spatial distribution of COVID-19 positive cases and population density in 14 districts of Kerala state

3.2.2 Tamilnadu state

Tamilnadu has recorded 175,678 corona positive cases, the highest number in south India. According to the state health bulletin, Chennai district alone, which has a population density of 26,553 persons per square km, reported 87,235 cases of coronavirus. Initially, the Koyambedu market in North Chennai was identified as a hotspot of COVID-19 (The Hindu 2020). About 2000 cases were linked to this popular market in Chennai. Later, the infection rate increased all over the state, especially in the districts of northern Tamilnadu. These districts have medium to very high population density and a large number of COVID-19 positive cases. However, the state also has a low mortality rate of about 1.45%. The data presented in Table 3 show a strong positive correlation between population density and the number of COVID-19 cases. To show the spatial distribution, we prepared a Thiessen polygon map of COVID-19 cases and population density. Figure 6 clearly depicts an increasing trend in the infection rate in the northern part of Tamilnadu.

Fig. 6

Thiessen polygon map of the spatial distribution of COVID-19 positive cases and population density in 38 districts of Tamilnadu state

3.2.3 Karnataka state

After Tamilnadu, Karnataka has the highest number of COVID-19 cases in south India, at about 69,614. As of 20th July, Bangalore Urban district had the highest number of COVID-19 cases in the state (Karnataka COVID 19 Dashboard Website). The recovery rate in the state is 34.18 percent, the lowest among all south Indian states, and the fatality rate is 2.02 percent, the highest among them. Figure 7 shows that the districts of northern Karnataka have seen a sharper surge in the number of COVID-19 cases than those of southern Karnataka, keeping the state government on its toes. Nevertheless, Bangalore Urban district is the most populous and has the highest infection rate in Karnataka. Figure 7 also indicates that central Karnataka is moderately infected and has a low to medium population density.

Fig. 7

Thiessen polygon map of spatial distribution of COVID-19 positive cases and population density in 30 districts of Karnataka state

3.2.4 Telangana state

Telangana state has observed a spike in the number of COVID-19 cases on a daily basis. Currently, 46,274 persons in the state have been infected; of these, 74 percent have recovered and 0.91 percent have died. Like Kerala, this state has a very low mortality rate, which is the second lowest in south India and far below the national average of 2.51 percent. Hyderabad district has the highest number of COVID-19 infections within the state, accounting for about 65 percent of cases, and has a high population density of 680 persons per sq. km. We used a prospective space–time Thiessen polygon map to detect the districts of Telangana most severely affected by COVID-19. Figure 8 clearly shows that the western part of Telangana has more infections than the rest of the state, as these areas have high population density. The eastern part of Telangana, in contrast, is less infected owing to its low population density. Our study found a strong correlation between the number of COVID-19 cases and the population density of the different districts.

Fig. 8

Thiessen polygon map of the spatial distribution of COVID-19 positive cases and population density in 33 districts of Telangana state

3.2.5 Andhra Pradesh state

Coronavirus is spreading in Andhra Pradesh in spite of the lockdown, with 53,724 positive cases as of 20th July 2020. The percentage of discharged persons and the fatality rate are 45 percent and 1.3 percent, respectively. The western and central parts of Andhra Pradesh are the most heavily infected among the thirteen districts of the state. Among all south Indian states, this state has the lowest population density, with 304 persons per square km. In Andhra Pradesh, the infection peak is mainly caused by people who have returned from foreign countries. Figure 9 shows the spatial distribution of corona positive cases and population density in the state. The dark blue color indicates higher population density and the yellow color indicates high numbers of COVID-19 cases. However, some differences are apparent in the south-eastern districts, where fewer infections are observed. The findings indicate a low association between these two variables, suggesting that population density is only a rough proxy for COVID-19 positive cases.

Fig. 9

Thiessen polygon map of the spatial distribution of COVID-19 positive cases and population density in 13 districts of Andhra Pradesh state

3.3 Response surface methodology for population density and COVID-19 infection

The state-wise data on population density, total population and COVID-19 positive cases are considered in this study to analyze the south Indian states, namely Andhra Pradesh, Karnataka, Kerala, Tamilnadu and Telangana. The parameters under study are tabulated state-wise in the following tables.

3.3.1 Andhra Pradesh state

Table 4 shows that the effect of population density and total population in Andhra Pradesh was significant (F = 10.12, P = 0.001), but no significant interaction was observed between the input variables affecting the total number of COVID-19 positive cases (F = 3.09, P = 0.766). The model shows a good fit, with \(R^{2}\) and adjusted \(R^{2}\) of 81.53% and 73.32%, respectively (see Table 5). The model fit is further supported by low standardized residuals and a non-significant lack of fit (P = 0.337) at the 95% confidence level. Figure 10 shows that the input data points are in close agreement with the normal distribution curve and that the residuals are closely distributed at various levels of the independent variable. From these residual plots, it can be inferred that equality of variance holds across the different levels of the response. The effect of total population and population density on COVID-19 infection is shown in the 2D contour graph and 3D surface plot (Fig. 11).

Table 4 ANOVA analysis of the response surface model for Andhra Pradesh state
Table 5 Model statistics of the developed model
Fig. 10 Residual plots for population density and COVID-19 positive cases in Andhra Pradesh state

Fig. 11 2-D contour graph and 3-D surface plot of COVID-19 infection versus population density and total population

3.3.2 Karnataka state

The effect of population density and total population in Karnataka was found to be significant (F = 20.12, P = 0.001). Table 6 shows a significant interaction between the input variables affecting the total number of COVID-19 positive cases (F = 9.23, P = 0.002). The response surface model shows a good fit, with R² and adjusted R² of 87.78% and 79.61%, respectively (see Table 7). The model fit is also supported by low standardized residuals and a lack-of-fit P-value of 0.228 at the 95% confidence level. Figure 12 shows that the data points as well as the residuals are close to the normal distribution. Hence, it can be said that equality of variance exists between different levels of the response. The 2D contour graph and three-dimensional response surface plot demonstrate the interactions between the variables (Fig. 13).

Table 6 ANOVA analysis of the response surface model for Karnataka state
Table 7 Model statistics of the developed model
Fig. 12 Residual plots for population density and COVID-19 positive cases in Karnataka state

Fig. 13 2-D contour graph and 3-D surface plot of COVID-19 infection versus population density and total population

3.3.3 Kerala state

The effect of population density and total population in Kerala was found to be significant (F = 9.93, P = 0.001). Table 8 shows a significant interaction between the input variables affecting the total number of COVID-19 positive cases (F = 5.62, P = 0.032). The response surface model shows a good fit, with an R² of 76.25% (see Table 9). The model fit is also supported by low standardized residuals and a lack-of-fit P-value of 0.043 at the 95% confidence level. Figure 14 shows that the data points are close to the normal distribution curve and that the residuals are closely distributed at various levels of the independent variable. Therefore, it can be inferred that equality of variance exists between different levels of the response. The 2D contour graph and 3D surface plot show that the number of COVID-19 cases increases significantly with population density and total population (Fig. 15).

Table 8 ANOVA analysis of the response surface model for Kerala state
Table 9 Model statistics of the developed model
Fig. 14 Residual plots for population density and COVID-19 positive cases in Kerala state

Fig. 15 2-D contour graph and 3-D surface plot of COVID-19 infection versus population density and total population

3.3.4 Tamilnadu state

Our study shows a significant effect of population density and total population in Tamilnadu (F = 7.07, P = 0.001), together with a significant interaction between the input variables affecting the total number of COVID-19 positive cases (F = 15.51, P = 0.001; Table 10). The response surface model shows a good fit, with R² and adjusted R² of 77.58% and 59.61%, respectively (see Table 11). The model fit is also supported by the standardized residuals and a lack-of-fit P-value of 0.290 at the 95% confidence level. Figure 16 shows that the data points as well as the residuals are close to the normal distribution. Hence, it can be said that equality of variance exists between different levels of the response. The combined impact of total population and population density on COVID-19 positive cases is illustrated by the 2D contour graph and three-dimensional response surface in Fig. 17.

Table 10 ANOVA analysis of the response surface model for Tamilnadu state
Table 11 Model statistics of the developed model
Fig. 16 Residual plots for population density and COVID-19 positive cases in Tamilnadu state

Fig. 17 2-D contour graph and 3-D surface plot of COVID-19 infection versus population density and total population

3.3.5 Telangana state

The effect of population density and total population in Telangana state was found to be significant (F = 21.12, P = 0.001). Table 12 shows a significant interaction between the input variables affecting the total number of COVID-19 positive cases (F = 10.10, P = 0.002). The response surface model shows a good fit, with R² and adjusted R² of 88.68% and 79.61%, respectively (see Table 13). The model fit is also supported by low standardized residuals and a lack-of-fit P-value of 0.031 at the 95% confidence level. Figure 18 shows that the data points are close to the normal distribution curve and that the residuals are closely distributed at various levels of the independent variable. Hence, it can be said that equality of variance exists between different levels of the response. To analyze the combined impact of population density and total population on COVID-19 infection, a 2D contour graph and 3D surface plot were employed (Fig. 19).

Table 12 ANOVA analysis of the response surface model for Telangana state
Table 13 Model statistics of the developed model
Fig. 18 Residual plots for population density and COVID-19 positive cases in Telangana state

Fig. 19 2-D contour graph and 3-D surface plot of COVID-19 infection versus population density and total population

While interpreting the response surfaces, it should be kept in mind that the plots are based on predictions and that each predicted point has a confidence interval. If the model error is known, the prediction variance can be determined (Myers et al. 1992). Usually, this uncertainty is governed by three factors: the design, the final model, and the prediction location (Mäkelä 2017). A response surface based on the half-fraction of the 95% confidence interval is illustrated in Figs. 11, 13, 15, 17 and 19 (Mäkelä and Geladi 2015). The three-dimensional response surface plot is rendered after the elimination of the non-significant response terms. Hence, the graphical representations help to build a clear mental picture of the relationships that exist between the variables (Myers et al. 1989). These plots provide an accurate geometric representation as well as useful information regarding the behaviour of the system within the experimental design (Annor et al. 2010) (see Figs. 11, 13, 15, 17, 19). In this study, response surface plots were constructed using the regression equation of the fitted model, with the least effective variables (non-significant at P < 0.05) held at a constant value while the other two variables were varied. The number of COVID-19 positive cases is the response, while population density and total population are the two process variables, at a significance level of P < 0.05.
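
The dependence of prediction uncertainty on the design, the model, and the prediction location can be made explicit with the standard least-squares result for the variance of a predicted mean response. The notation is an assumption introduced here and is not taken from the source: X is the n × p model matrix (determined by the design and the model terms), x_0 is the vector of model terms evaluated at the prediction location, σ² is the error variance, and s is its estimate.

```latex
\operatorname{Var}\!\left[\hat{y}(\mathbf{x}_0)\right]
  = \sigma^{2}\,\mathbf{x}_0^{\top}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{x}_0,
\qquad
\hat{y}(\mathbf{x}_0) \;\pm\; t_{\alpha/2,\,n-p}\; s\,
  \sqrt{\mathbf{x}_0^{\top}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{x}_0}
```

The second expression is the usual 100(1 − α)% confidence interval for the mean response at x_0; the half-width grows as x_0 moves away from the region covered by the design points.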

4 Conclusion

The new coronavirus disease (COVID-19) has been recognized as the most serious threat to public health since the influenza pandemic of 1918 (Parmet and Rothstein 2018). It is generally assumed that population density is correlated with higher transmission rates, infection levels and mortality from highly communicable diseases such as COVID-19. This study uses a systematic and recent approach to explore the connection between population density and COVID-19 in the affected south Indian states. To this end, the present study investigated publicly available data from the COVID-19 India portal and state government portals to detect the emerging space-time distribution of COVID-19 contagion across districts. Our results, based on data from the five south Indian states, illustrate the significance of the relationship between population density and COVID-19 positive cases. In the inter-district spatial analysis of the south Indian states, we observed that population density better explains the contagion effect. However, the population density of Andhra Pradesh state was not reflected in COVID-19 transmission. The empirical findings show positive correlation coefficients between population density and infection for Tamilnadu, Karnataka, Kerala and Telangana states. The association of population density with COVID-19 positive cases was examined by applying the RSM model. This model was designed to relate the experimental variables to the responses. Significance tests and regression analysis of these variables were carried out to predict the response. The purpose of this study was to identify the COVID-19 hotspots of the south Indian states.
Confirmed case numbers alone are insufficient to recognize the true nature of the COVID-19 pandemic. However, compiling data sets that include suspected, potential and negative test numbers can significantly boost monitoring efforts and our understanding of the transmission dynamics of infectious diseases like COVID-19 (Lipsitch et al. 2020; Gandhi et al. 2020). Despite the significance of our study, it is important to mention its limitations. A key limitation is that, even within this sixteen-week sampling period, we are not able to separate the effects of each of the strict measures taken. Another constraint is that the case data contain only confirmed cases; possible and suspected cases are not included because such data were not available. The true character of the COVID-19 pandemic will not be understood for some time. Further research is therefore necessary for a deeper understanding of the relationship between COVID-19 and population density in zones of different states of India.

We recognize the importance of monitoring emerging and active districts during pandemics, while setting aside earlier-affected districts that do not present a serious threat to public health. In addition, the sharing and availability of data are very important and enable a variety of researchers to contribute their knowledge of COVID-19 in general. Demographers can play a crucial role in mitigating the spread of diseases. Our findings highlight the role of planners in taking steps tailored to their communities to enforce social distancing measures more effectively. These findings also suggest that the timeframe of pandemic spread, based on multivariate population density analysis, may be useful for intervention planning during possible future pandemics, such as a second COVID-19 outbreak. The development of more accurate and reliable models, in terms of scope and prediction, will allow policymakers to better conceptualize the pandemic and implement more appropriate and efficient policies. Our analysis can help district officials to identify COVID-19 hotspots and prepare new strategies that follow World Health Organization guidelines such as the ‘T3’ strategy: testing, tracing and treatment.