Abstract
Recently, the large outbreak of COVID-19 cases all over the world has hit India with about 30,000 confirmed cases within the first 3 months of transmission. The present study used long-term climatic records of air temperature (T), rainfall (R), actual evapotranspiration (AET), solar radiation (SR), specific humidity (SH) and wind speed (WS), together with topographic altitude (E) and population density (PD) at the regional level, to investigate their spatial association with the number of COVID-19 infections (NI). Bivariate analysis failed to find any significant relation (except for SR) with the number of infected cases across the 36 provinces of India. Variable Importance of Projection (VIP) through the Partial Least Square (PLS) technique indicated higher importance of SR, T, R and AET. However, a generalized additive model fitted with log-transformed input variables and spline smoothing applied to PD and E achieved high prediction accuracy (R2 = 0.89), and thus explained well the complex heterogeneity in the association of regional parameters with COVID-19 cases in India. Our study suggests that comparatively hot and dry regions at lower altitude in the Indian territory are more prone to infection by COVID-19 transmission.
Introduction
Coronavirus disease 2019 (COVID-19), already considered a global pandemic, is rapidly spreading across the world and significantly affecting many countries (Singhal 2020; Asyary and Veruswati 2020). This outbreak of a novel coronavirus (SARS-CoV-2) disease began in December 2019 in Wuhan, Hubei Province, China (Gorbalenya 2020; Ma et al. 2020; Wu et al. 2020). By March 25, 2020, the disease had spread from Wuhan to 196 countries in different parts of the world (Chen et al. 2020; Xu et al. 2020). As of April 28, 2020, there had been a total of 3.12 million confirmed cases worldwide. This contact-transmissible disease has an average incubation period of 6 to 14 days (Tosepu et al. 2020). Fever, respiratory disorder, coughing and shortness of breath are some of the early symptoms, while in the acute stage it can even lead to death (Holshue et al. 2020; Perlman 2020; Tosepu et al. 2020).
According to WHO, the first infected case in India was reported on Jan 30, 2020. Later, from around March 4 onwards, it turned into a major outbreak. As of April 27, Maharashtra was the leading state with a total of 8590 cases, while the whole country had recorded a total of 29,458 cases. In the absence of a vaccine, social distancing is the only measure that has been adopted. SARS-CoV-2 can be transmitted through various bio-aerosols, large droplets or direct contact with secretions, similar to the influenza virus (Li et al. 2005; Qi et al. 2020). Virus transmission can be influenced by several geographical factors such as climatic conditions (temperature and humidity) and population density (PD) (Dalziel et al. 2018; Casanova et al. 2010). It was observed that the outbreak is more severe in countries located in the mid-latitudes, where the temperature is considerably lower than in tropical countries. Many researchers from different parts of the world have tried to establish a relationship between COVID-19 transmission and various meteorological factors (Bashir et al. 2020; Prata et al. 2020; Shi et al. 2020). In a study conducted in New York, USA, using Kendall and Spearman rank correlation tests, mean temperature, minimum temperature and air quality were found to have a significant association with the COVID-19 pandemic (Bashir et al. 2020). Shi et al. (2020) reported a significant correlation between daily temperature and the daily count of COVID-19 cases in China and suggested that temperatures above 8–10 °C would lead to a decline in infected cases. Prata et al. (2020) concluded that a rise of 1 °C in temperature would result in a decrease in the number of daily confirmed COVID-19 cases in Brazil.
In India, so far, no comprehensive study of the climatic influences on COVID-19 has been reported. Therefore, in this study, we investigated the correlations between climatic and topographic factors and the state-wise total number of infected cases. The main goal is to examine scientific evidence about the spread of COVID-19 cases in India based on regional factors, including PD, climatic conditions and topography.
Data and methodology
Data collection
In this study, we attempted to correlate different climatic and topographic variables with the number of COVID-19 infections in different states of India. We retrieved data on the number of COVID-19 cases in all the states of India as of April 27, 2020 from https://www.covid19india.org/. PD data were acquired from the census India website (https://www.census2011.co.in). Because daily ground-monitored weather data are limited in India, we obtained long-term annual climatic data [viz. temperature, rainfall, actual evapotranspiration (AET), wind speed (WS), solar radiation (SR), and specific humidity (SH)] from the TerraClimate and Worldclim websites (http://www.climatologylab.org/terraclimate.html). The Shuttle Radar Topography Mission (SRTM) digital elevation model of 90-m spatial resolution was obtained from the CGIAR website (http://srtm.csi.cgiar.org/).
Determination of climatic zones
The first part of our research was intended to understand the relative climatic conditions of the different states. Hence, we implemented the De Martonne aridity–humidity index (De Martonne 1925). Although this methodology is more appropriate for smaller areas (Baltas 2007), it was adopted here for regional classification because of its simple calculation and fair generalization (Ahmadi et al. 2020). Moreover, because temperature and rainfall data are readily available, this method is widely used (Zareiee 2014). The aridity index was computed with the following equation:
$$I_{DM} = \frac{P}{T + 10}$$
where $I_{DM}$ denotes the aridity index, P is the annual mean precipitation in mm, and T is the annual mean air temperature in °C.
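As a minimal sketch of this computation, the snippet below evaluates the index in its standard De Martonne form, P divided by (T + 10). The function name and the guard for very cold climates are illustrative assumptions, not part of the original workflow; the example inputs are the Ladakh values quoted in the Results (164 mm, −5 °C).

```python
def de_martonne_index(precip_mm: float, temp_c: float) -> float:
    """De Martonne aridity index: I_DM = P / (T + 10).

    precip_mm: annual mean precipitation in mm
    temp_c:    annual mean air temperature in deg C
    """
    denom = temp_c + 10.0
    if denom <= 0:
        # Assumption for this sketch: treat T <= -10 deg C as out of range,
        # because the ratio is undefined or negative there.
        raise ValueError("index is not defined for T <= -10 deg C")
    return precip_mm / denom


# Ladakh, using the rainfall and temperature reported in the Results
ladakh_idm = de_martonne_index(164.0, -5.0)
print(round(ladakh_idm, 1))
```

Because the denominator shrinks as T approaches −10 °C, cold regions with little rainfall can still receive a fairly high index value, which is worth keeping in mind when reading the zone map.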
Correlation and bivariate linear regression
Initially, the Pearson product-moment correlation was applied to the number of infected cases along with all the input variables to determine their inter-correlations. Later, bivariate linear regression was performed to determine whether any significant relationship exists between the topo-climatic factors and COVID-19 transmission.
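The two steps can be sketched in plain Python as follows. The data values are hypothetical placeholders (they are not the study's state-level records), and the function names are assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def linear_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical solar radiation (kJ m-2 day-1) and case counts for five regions
solar = [15500.0, 16800.0, 17900.0, 19000.0, 20100.0]
cases = [40.0, 310.0, 950.0, 2100.0, 3300.0]

r = pearson_r(solar, cases)
slope, intercept = linear_fit(solar, cases)
# t statistic for H0: r = 0, to be compared with a t distribution on n - 2 df
t_stat = r * math.sqrt((len(solar) - 2) / (1.0 - r ** 2))
print(f"r = {r:.3f}, slope = {slope:.4f}, t = {t_stat:.2f}")
```

The t statistic gives the significance test for a single pair of variables; repeating it for every predictor yields the kind of correlation table reported in the Results.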
Variable importance of projection (VIP)
Partial least square (PLS) regression is a common method that reduces the predictor variables to a smaller set of uncorrelated components. Instead of the original data, it runs least square regression on the reduced number of components. In general, PLS is very useful when the predictor variables are collinear. PLS also provides a measure called VIP that determines the relative importance of each factor (Akarachantachote et al. 2014). To compute relative importance easily, we applied PLS to our topo-climatic data to construct a model and determine the relative importance of the variables. The VIP score of variable j can be calculated using the following equation:
$$\mathrm{VIP}_j = \sqrt{\frac{p \sum_{a=1}^{A} R^2(y, t_a)\,\left(w_{aj} / \lVert w_a \rVert\right)^2}{\sum_{a=1}^{A} R^2(y, t_a)}}$$
where $w_{aj}$ denotes the weight of the jth factor in component a, $R^2(y, t_a)$ indicates the fraction of variance in y explained by component a, p is the number of predictor variables and A is the number of components.
Detailed methodology of PLS and VIP can be found in the study of Wold et al. (1993) and Akarachantachote et al. (2014).
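A small sketch of the VIP calculation is given below, assuming the commonly used normalized form in which each component's squared, norm-scaled weight is weighted by the variance in y that the component explains, multiplied by the number of predictors, and divided by the total explained variance. The weights and explained-variance fractions are hypothetical; in practice they would come from a fitted PLS model.

```python
import math


def vip_scores(weights, r2_per_component):
    """VIP score for every predictor.

    weights[a][j]       : PLS weight of predictor j in component a
    r2_per_component[a] : fraction of variance in y explained by component a
    """
    p = len(weights[0])              # number of predictors
    total_r2 = sum(r2_per_component)
    scores = []
    for j in range(p):
        acc = 0.0
        for w_a, r2_a in zip(weights, r2_per_component):
            norm_sq = sum(w * w for w in w_a)   # squared norm of w_a
            acc += r2_a * (w_a[j] ** 2) / norm_sq
        scores.append(math.sqrt(p * acc / total_r2))
    return scores


# Hypothetical two-component model with four predictors
weights = [[0.8, 0.5, 0.3, 0.1],
           [0.2, 0.6, -0.7, 0.3]]
r2 = [0.55, 0.20]

vip = vip_scores(weights, r2)
print([round(v, 3) for v in vip])
```

With this normalization the squared VIP scores sum to the number of predictors, which is why a value of 1 is the usual threshold for calling a variable important.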
Generalized additive model (GAM)
GAM has recently been used extensively in numerous studies and found useful for correlating COVID-19 cases with various local meteorological parameters (Ma et al. 2020; Qi et al. 2020; Prata et al. 2020; Wu et al. 2020). In the present study, a log-linear GAM was applied to analyze the state-specific associations between infected counts and regional climatic factors, topography and PD. First, the basic model was built with the total number of infected cases as the outcome of all other input parameters. Then, the parameters were log-transformed and a smooth spline function was applied specifically to PD and E, because only for these two variables did the standard deviation exceed the mean, owing to extremely high heterogeneity at the regional level. Thus, the equation can be expressed as follows:
This approach also helped to explore the linear and nonlinear effects of various parameters on health outcomes in terms of COVID-19 infections.
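One plausible way to write such a model, given the description above, is sketched below. This is an assumed form, not necessarily the exact specification that was fitted: the coefficient names, the error term and the choice to log-transform the response (rather than use a log link) are assumptions, and the text does not state how non-positive values, such as sub-zero annual mean temperatures, were treated before taking logarithms.

```latex
\log(NI_i) = \beta_0
  + \beta_1 \log(T_i) + \beta_2 \log(R_i) + \beta_3 \log(AET_i)
  + \beta_4 \log(SR_i) + \beta_5 \log(SH_i) + \beta_6 \log(WS_i)
  + s\big(\log(PD_i)\big) + s\big(\log(E_i)\big) + \varepsilon_i
```

Here i indexes the provinces, the β terms are parametric (linear) effects, and s(·) denotes a spline smooth, so only PD and E are allowed to act nonlinearly.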
Results
Descriptive analysis
A total of 29,487 confirmed cases of infection had been reported across India as of April 27, 2020. Maharashtra registered the highest number of confirmed cases (8590), while only 9 of the 36 provinces (comprising 28 states and 8 union territories) individually registered more than 1000 cases. PD in India varies from 17 to 11,320 persons km−2 across the states and union territories (Table 1). Because of the broad latitudinal differences among states, the climatic variables show high variability (Fig. 1). The annual mean temperature varies from −5 °C (Ladakh) to 28 °C (Puducherry) (Table 1), while the highest annual mean rainfall is observed in Meghalaya (3914 mm) and the lowest in Ladakh (164 mm). SH ranges from 0.002 to 0.015 kg kg−1. The range of AET across the states is very wide (10.75–100.99 mm). Monthly mean WS at 10 m above the surface varies from 0.99 to 2.76 m s−1. SR varies between 15,236 and 20,301 kJ m−2 day−1. Average elevation varies from 15 to 4661 m above mean sea level.
Climatic regions and COVID-19 cases
Based on the De Martonne classification (Table 2), we found six different climatic zones across India (i.e., semi-arid, moderate, semi-wet, wet, very wet and extremely wet) (Fig. 2). According to this climatic classification, we identified five provinces under semi-arid, two under moderate, three under semi-wet, five under wet, seven under very wet and thirteen under extremely wet categories. The spatial distribution of COVID-19 cases in India indicates that maximum transmission occurred within the states that fall under the semi-arid and wet categories. However, provinces under the very wet (7) and extremely wet (13) categories are likely less affected by such transmission (Fig. 2).
Bivariate correlation among variables
To understand the influence of different climatic and topographic factors, we performed bivariate correlation using the long-term climatic data and topographic elevation. Table 3 shows the Pearson correlation coefficients between each pair of variables. We selected the number of infections as the dependent variable, and all the geographical parameters were correlated with it as independent variables. Both temperature and rainfall showed a significant positive correlation with SH and AET. Temperature was also strongly correlated with SR (positively) and with elevation (negatively) (Table 3). Moreover, we found a significant positive relationship between the number of infections and SR (Fig. 3). Although no other variable was significantly correlated with the number of infections, a notable positive relationship with temperature and a negative relationship with rainfall were observed. Similarly, SH, AET and altitudinal variation have a negative relation with the number of infections, while WS shows a positive relation (Fig. 3). Surprisingly, we found no significant correlation between PD and the number of infections. Notably, the correlations improved for most of the variables when the variables were log-transformed before being inter-correlated.
Variable importance of projection (VIP)
Figure 3i illustrates the VIP of each variable. A large VIP value (> 1) was recorded for SR, rainfall, temperature and AET. Elevation, wind speed, PD and SH had VIP values lower than 1.
Implementing GAM
Using the GAM, an attempt was made to relate the number of infected cases to all the geographical variables considered in this study. Initially, we found no significant relation (R2 = 0.219) using a simple linear GAM. However, log-transforming all variables significantly improved the performance of the model (R2 = 0.782). When the flexibility of the spline smoothing function was applied to the log-transformed values of PD (Fig. 4a) and E (Fig. 4b), the R2 value rose to 0.895 (Fig. 4c). The parametric coefficients and approximate significance of the smooth terms are listed in Table S4 (see supplementary files), which shows that all input parameters fitted the model such that all coefficients were statistically significant at the 0.05 level (p values < 0.05).
Log-transformed data with a smoothing function applied to E and PD improved the model's prediction accuracy substantially. Thus, the model captured the complex non-linearity in the relation of COVID-19 infections with geographical distribution. Although simple bivariate correlation does not reveal any significant relationship, the strong performance of the GAM suggests that multiple complex parameters should be taken into account in further investigation in any spatial context. The GAM experiment indicates that NI is negatively associated with SH and R, and positively associated with SR and T.
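For readers who want to reproduce the goodness-of-fit comparison, a plain coefficient of determination can be computed as in the sketch below. Whether the reported values are ordinary or adjusted R2 is not stated, so the ordinary form is assumed here, and the observed and predicted values are hypothetical.

```python
def r_squared(observed, predicted):
    """Ordinary coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical log-transformed case counts and two sets of model predictions
observed = [2.1, 3.4, 4.0, 5.2, 6.8, 7.5]
linear_pred = [3.9, 4.2, 4.6, 4.9, 5.3, 5.6]   # weak model
smooth_pred = [2.3, 3.2, 4.3, 5.0, 6.6, 7.6]   # better-fitting model

print(round(r_squared(observed, linear_pred), 3))
print(round(r_squared(observed, smooth_pred), 3))
```

Comparing the two printed values mirrors the comparison made above between the simple model and the model with log-transformed, smoothed terms.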
Discussion
The recent COVID-19 outbreak has caused a significant health burden in many places around the world (Ma et al. 2020). In this paper, we investigated the spatial relationship of long-term climate, topography and social factors with the counts of confirmed COVID-19 cases in India. A substantial number of studies in different places around the world have already tried to examine whether there are any correlations between the COVID-19 outbreak and the existing weather or climatic conditions (Bashir et al. 2020; Sajadi et al. 2020). The prevailing meteorology (temperature, humidity, WS, etc.) significantly alters environmental stability and might therefore affect the survival of viruses and the transmission process (Tosepu et al. 2020). According to Chen et al. (2020), COVID-19 transmission is significantly affected by the surrounding air temperature and humidity conditions, a finding shared by Shi et al. (2020) for the major outbreak in mainland China.
In this study, we found a positive correlation between the number of infections and long-term climatic records of temperature, WS and SR (significant), as well as PD. In China, Shi et al. (2020) reported a negative correlation between temperature and COVID-19 transmission on the basis of daily weather reports. However, Ma et al. (2020) reported a positive association between mortality rate and daily temperature in Wuhan, China. In the global context, transmission is found to be higher in particular regions of subtropical countries where the surrounding air temperature is significantly low (Poole 2020).
The significant correlation between SR and COVID-19 infection in India clearly indicates that high insolation during the daytime does not prevent COVID-19 transmission. However, sunlight has the ability to boost the immune system and slow down the growth of infections in the human body (Cannell et al. 2006; Miller 2018; Asyary and Veruswati 2020). Asyary and Veruswati (2020) investigated the role of sunlight in COVID-19 outbreak and recovery. They did not observe any noticeable trend between sunlight exposure and the transmission rate, but reported a significant recovery rate under sunlight exposure.
Our study indicates a negative association of rainfall, SH, AET and elevation with the number of infections. A time-series study from China indicated a negative correlation between daily relative humidity and COVID-19 transmission (Qi et al. 2020). Moreover, a large number of previous epidemiological investigations reported a negative association between humidity and coronavirus-like diseases (Zhang Qiang et al. 2004; Gardner et al. 2019). Thus, the findings of the present research in the Indian context agree with these reports.
We did not find any literature correlating regional elevation with COVID-19 transmission. Hence, we included the average elevation of each province, since it significantly controls the climatic conditions. Our study indicated that low-lying regions in India are more likely to experience higher COVID-19 transmission.
The occurrence of infection counts in various climatic regions suggests that the transmission rate is likely lower in the provinces under the very wet and extremely wet categories of climatic conditions, and thus points to a lower rate of transmission in wet conditions. Accounting for 29.2% of total cases in India, Maharashtra has already emerged as the prime hotspot. Moreover, 24.72% of the total cases were found in neighboring states (i.e., Gujarat, Madhya Pradesh, Goa, Chhattisgarh and Telangana). This might be a result of rapid migration before the lockdown (March 25, 2020).
In the present study, the GAM predicted infected cases well when several geographical parameters were considered together. From the GAM, we understood that hot and dry areas are more likely to be affected by COVID-19 transmission. Higher WS at the micro-scale may improve ventilation, but our study suggests that it does not have a meaningful impact at the regional scale. Residual plots of the smooth terms (i.e., PD and E) indicate that population statistics or regional topography may not account for transmission on their own; however, they are important in combination with meteorology.
Like any scientific investigation, our study has several limitations: (1) we have used only long-term climatic records to indicate the association between COVID-19 cases and prevailing conditions. Investigation using real-time daily weather data in different states is indeed required. (2) As the disease is caused by a virus, many other factors might be considered, such as population migration, immunity, age groups, hygiene systems, etc. Despite these limitations, this study is highly significant as it is the first report investigating the association of climate and COVID-19 transmission in the Indian context. This is only a basic analysis, and a larger amount of data (district-wise) might be incorporated for a stronger conclusion.
Conclusion
The present study aimed to understand the geographical influence on the spatial distribution of COVID-19 transmission at the regional level in the context of India. Several statistical analyses show that climatic factors have an influence on this viral disease in India that cannot be ignored. The heterogeneity in the spatial occurrence of infections might be attributed to local meteorology together with geographical location and population. However, no single attribute can individually explain the nature of transmission well. The positive association with SR and temperature, as well as the negative association with humidity and rainfall, suggests that hot and arid areas in low-altitude regions need to strictly follow preventive measures on an emergency basis.
References
Ahmadi M, Sharifi A, Dorosti S et al (2020) Investigation of effective climatology parameters on COVID-19 outbreak in Iran. Sci Total Environ. https://doi.org/10.1016/j.scitotenv.2020.138705
Akarachantachote N, Chadcham S, Saithanu K (2014) Cutoff threshold of variable importance in projection for variable selection. Int J Pure Appl Math 94:307–322
Asyary A, Veruswati M (2020) Sunlight exposure increased COVID-19 recovery rates: a study in the central pandemic area of Indonesia. Sci Total Environ 729:139016. https://doi.org/10.1016/j.scitotenv.2020.139016
Baltas E (2007) Spatial distribution of climatic indices in northern Greece. Meteorol Appl 14:69–78
Bashir MF, Ma B, Bilal Komal B, Bashir MA, Tan D, Bashir M (2020) Correlation between climate indicators and COVID-19 pandemic in New York, USA. Sci Total Environ 728:138835
Cannell JJ, Vieth R, Umhau JC, Holick MF, Grant WB, Madronich S, Garland CF, Giovannucci E (2006) Epidemic influenza and vitamin D. Epidemiol Infect 134:1129–1140
Casanova LM, Jeon S, Rutala WA, Weber DJ, Sobsey MD (2010) Effects of air temperature and relative humidity on coronavirus survival on surfaces. Appl Environ Microbiol 76:2712–2717
Chen B, Liang H, Yuan X, Hu Y, Xu M, Zhao Y et al (2020) Roles of meteorological conditions in COVID-19 transmission on a worldwide scale. medRxiv. https://doi.org/10.1101/2020.03.16.20037168
Dalziel BD, Kissler S, Gog JR, Viboud C, Bjørnstad ON, Metcalf CJE et al (2018) Urbanization and humidity shape the intensity of influenza epidemics in U.S. cities. Science 362:75–79
De Martonne E (1925) Traite de Geographie Physique: 3 tomes. Paris
Gardner EG, Kelton D, Poljak Z, Van Kerkhove M, von Dobschuetz S, Greer AL (2019) A case-crossover analysis of the impact of weather on primary cases of Middle East respiratory syndrome. BMC Infect Dis 19:113
Gorbalenya AE (2020) Severe acute respiratory syndrome-related coronavirus—the species and its viruses, a statement of the Coronavirus Study Group. BioRxiv. https://doi.org/10.1101/2020.02.07.937862
Holshue ML, DeBolt C, Lindquist S, Lofy KH, Wiesman J, Bruce H et al (2020) First case of 2019 novel coronavirus in the United States. N Engl J Med 382:929–936
Li Y, Huang X, Yu IT et al (2005) Role of air distribution in SARS transmission during the largest nosocomial outbreak in Hong Kong. Indoor Air 15:83–95
Ma Y, Zhao Y, Liu J, He X, Wang B, Fu S, Yan J, Niu J, Zhou J, Luo B (2020) Effects of temperature variation and humidity on the death of COVID-19 in Wuhan, China. Sci Total Environ 724:138226
Miller B (2018) Immune system: your best defense against viruses and bacteria from the common cold to the SARS virus. Oak Publication Sdn Bhd, Malaysia
Perlman S (2020) Another decade, another coronavirus. N Engl J Med 382:760–762
Poole L (2020) Seasonal influences on the spread of SARS-CoV-2 (COVID19), causality, and forecastabililty (3-15-2020). https://doi.org/10.2139/ssrn.3554746
Prata DN, Rodrigues W, Bermejo PH (2020) Temperature significantly changes COVID-19 transmission in (sub) tropical cities of Brazil. Sci Total Environ 729:138862
Qi H, Xiao S, Shi R, Ward MP, Chen Y, Su Q, Wang W, Wang X, Zhang Z (2020) COVID-19 transmission in mainland China is associated with temperature and humidity: a time-series analysis. Sci Total Environ 728:138778
Sajadi MM, Habibzadeh P, Vintzileos A, Shokouhi S, Miralles-Wilhelm F, Amoroso A (2020) Temperature and latitude analysis to predict potential spread and seasonality for COVID-19. Available at SSRN 3550308
Shi P, Dong Y, Yan H, Zhao C, Li X, Liu W, He M, Tang S, Zi S (2020) Impact of temperature on the dynamics of the COVID-19 outbreak in China. Sci Total Environ 728:138890
Singhal T (2020) A review of coronavirus disease-2019 (COVID-19). Indian J Pediatr 87:281–286. https://doi.org/10.1007/s12098-020-03263-6
Tosepu R, Gunawan J, Effendy DS, La Ahmad OAI, Lestari H, Bahar H, Asfian P (2020) Correlation between weather and Covid-19 pandemic in Jakarta, Indonesia. Sci Total Environ 725:138436
Wold S, Johansson A, Cochi M (1993) PLS-partial least squares projections to latent structures. ESCOM Science Publishers, Leiden, pp 523–550
Wu F, Zhao S, Yu B, Chen Y, Wang W, Song Z, Hu Y, Tao Z, Tian J, Pei Y, Yuan M, Zhang Y, Dai F, Liu Y, Wang Q, Zheng J, Xu L, Holmes EC, Zhang Y (2020) A new coronavirus associated with human respiratory disease in China. Nature 579(7798):265–269. https://doi.org/10.1038/s41586-020-2008-3
Xu Z, Shi L, Wang Y, Zhang J, Huang L, Zhang C, Liu S, Zhao P, Liu H, Zhu L et al (2020) Pathological findings of COVID-19 associated with acute respiratory distress syndrome. Lancet Respir Med
Zareiee AR (2014) Evaluation of changes in different climates of Iran, using De Martonne index and Mann-Kendall trend test. Nat Hazards Earth Syst Sci Discuss 2:2245–2261
Zhang Qiang YX-W, Ye D-X, Xiao F-J, Cheng Z-H (2004) Meteorological characteristics and their impacts during the SARS epidemic period. J Nanjing Institute Meteorol 19:849–855
Acknowledgements
S.D. and S.B. wish to thank the Department of Geography, Savitribai Phule Pune University, for providing necessary facilities to carry out this study. All authors are thankful to covid19india.org, census India, CGIAR, TerraClimate and Worldclim websites for providing required data used in this study.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Gupta, A., Banerjee, S. & Das, S. Significance of geographical factors to the COVID-19 outbreak in India. Model. Earth Syst. Environ. 6, 2645–2653 (2020). https://doi.org/10.1007/s40808-020-00838-2
DOI: https://doi.org/10.1007/s40808-020-00838-2