Introduction

In late December 2019, a novel coronavirus disease, later named COVID-19, was first reported in Wuhan, Hubei Province, China (Daraei et al. 2020; Lu et al. 2020; Wu et al. 2020). Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the pathogenic agent of COVID-19 (Lu et al. 2020), and most infected patients presented with fever and shortness of breath (Chen et al. 2020). The epidemic has caused serious demographic changes and unemployment (Bashir et al. 2020b). COVID-19 has been confirmed to spread through direct human-to-human contact (Chan et al. 2020). As of August 13, 2020, a total of 20,871,160 COVID-19 cases had been confirmed worldwide, and 81 countries had each reported more than 10,000 confirmed cases (https://www.hopkinsmedicine.org/coronavirus). South America accounted for about 27% of the world’s confirmed cases, making it the region with the highest number of confirmed cases. The confirmed cases in South America up to August 13, 2020 are shown in Fig. 1.

Fig. 1 Confirmed cases of COVID-19 across South America until August 13, 2020. Data on coronavirus disease 2019 (COVID-19) from https://www.jhu.edu/

Air pollution remains a major public health threat globally (Hashim et al. 2021). Previous studies indicated that air pollution can increase the spread of diseases (Bell et al. 2004; Goings et al. 1989; Wei et al. 2019). Virus-laden droplets can stay suspended in the air for a short time, and these particles may pose a threat of infection if they are inhaled by nearby persons. This route makes it possible for people infected with COVID-19 to spread the infection to others (Anfinrud et al. 2020; Meselson 2020). Laboratory experiments have been reported to show that SARS-CoV-2 can survive in aerosols for days or weeks, which makes airborne transmission plausible (Liu et al. 2020). Particulate matter such as PM10 and PM2.5, owing to its small size, can easily penetrate the lower respiratory tract and may carry the virus directly into the tracheobronchial region and the alveoli (Qu et al. 2020). Several studies suggest that air pollutants can act as carriers of viruses and can weaken the immune system, making the human body more vulnerable to viral infection (Becker and Soukup 1999; Glencross et al. 2020; Xie et al. 2019; Xu et al. 2020). Air pollutants have been shown to affect the transmission and severity of respiratory viral infections, including severe acute respiratory syndrome (SARS), Middle East respiratory syndrome (MERS), and COVID-19 (Cui et al. 2003; Domingo and Rovira 2020; Silva et al. 2014). For example, air pollution was positively correlated with SARS mortality in China (Cui et al. 2003). The environment around us is filled with contaminants that can inadvertently expose humans to viruses (Daraei et al. 2020). Although risk factors for COVID-19 are still under investigation, environmental factors such as air pollution may play a significant role in the spread of the epidemic among the population.

For SARS-CoV-2, multiple studies have reported a significant association between air pollution and the rate of spread of COVID-19. Several recent studies have shown that the risk of contracting COVID-19 is significantly higher for individuals exposed to environmental pollutants (Coccia 2021; Liu et al. 2021). Generalized additive models (GAM) showed that six air pollutants (PM2.5, PM10, SO2, CO, NO2, and O3) were significantly related to the confirmed cases in 120 cities in China from January 23 to February 29, 2020. Empirical estimates suggested that PM2.5 is a significant factor associated with the COVID-19 pandemic in the 10 most affected states in the USA (Bilal et al. 2021b). In Europe, the most severely affected region is also the one with the highest concentrations of PM10 and PM2.5 (Martelletti and Martelletti 2020). Furthermore, most fatalities occurred in the regions with the highest NO2 concentration (Ogen 2020). Spearman correlation analysis indicated that PM2.5, O3, and NO2 have a significant relationship with the outbreak of COVID-19 (Bilal et al. 2020). These relationships were also confirmed in California (USA) and in India (Bashir et al. 2020c; Sharma et al. 2020). In South America, correlation analysis and wavelet transform coherence were used to explore the relationship between environmental pollution indicators and the spread of COVID-19. The results showed that PM10, NO2, CO, and O3 are significantly associated with the COVID-19 pandemic (Bilal et al. 2021a).

The reported impact of air pollution on the epidemic varies from study to study. The findings have been inconsistent, and there is limited compelling evidence on the shape and magnitude of these relationships. Therefore, it is necessary to explore further the effect of air pollution on the spread of COVID-19. Tracking the epidemic data and the dynamic variations of these values can help to estimate the spread of this emerging pandemic (Merl et al. 2009). Here, we assembled datasets on the spread of the COVID-19 pandemic in five regions of South America. The time-dependent reproduction number (Rt) in each region was estimated to assess the expected number of secondary cases arising from a primary case infected at time t (Thompson et al. 2019). The objective of this work is to assess the relationship between different air pollutants and newly confirmed cases of COVID-19. To evaluate the impact of pollutants on epidemic spread more objectively and comprehensively, two frequently used models, generalized additive models (GAM) and multiple linear regression, were both applied to each city. The results from the two models were then compared, both to explore the impact of air pollution on the spread of COVID-19 and to identify differences between the models.

Materials and methods

Database of air pollutants and COVID-19 infection

Five regions from three countries in South America were studied in this work: Sao Paulo, Sao Jose dos Campos, and Vitoria in Brazil, Guayaquil in Ecuador, and Bogota in Colombia. Time series data for six major air pollutants, PM2.5, PM10, O3, nitrogen dioxide (NO2), sulfur dioxide (SO2), and carbon monoxide (CO), were obtained from the real-time air quality index of the Air Pollution in the World database (data source: aqicn.org/data-platform/covid19/). This website uses the air pollutant standard of the US Environmental Protection Agency (EPA). The daily air quality index (AQI) data were then converted to concentrations with the AirNow AQI calculator (https://www.airnow.gov/aqi/aqi-calculator/). The numbers of newly confirmed cases from March 28 to June 10, 2020 were collected directly from the national health departments, as shown in Table 1.
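The AQI-to-concentration step described above amounts to inverting the piecewise-linear EPA AQI formula. The following is a minimal sketch for PM2.5 only, assuming the 24-hour PM2.5 breakpoints that the US EPA used in 2020 (they were revised in 2024). The function name `aqi_to_pm25` is hypothetical, and this is an illustration rather than the conversion script actually used in the study.

```python
# Inverse of the piecewise-linear US EPA AQI formula, PM2.5 only.
# Assumption: 24-hour PM2.5 breakpoints in force in 2020 (pre-2024 revision).
# Each row: (AQI_low, AQI_high, conc_low, conc_high), concentrations in ug/m3.
PM25_BREAKPOINTS = [
    (0, 50, 0.0, 12.0),
    (51, 100, 12.1, 35.4),
    (101, 150, 35.5, 55.4),
    (151, 200, 55.5, 150.4),
    (201, 300, 150.5, 250.4),
    (301, 400, 250.5, 350.4),
    (401, 500, 350.5, 500.4),
]


def aqi_to_pm25(aqi):
    """Convert a PM2.5 AQI value to a concentration in ug/m3 (hypothetical helper)."""
    if not 0 <= aqi <= 500:
        raise ValueError("AQI must lie between 0 and 500")
    for i_lo, i_hi, c_lo, c_hi in PM25_BREAKPOINTS:
        if aqi <= i_hi:
            # Non-integer values that fall between two bands are clamped
            # to the lower edge of the upper band.
            aqi = max(aqi, i_lo)
            return c_lo + (aqi - i_lo) * (c_hi - c_lo) / (i_hi - i_lo)
    raise ValueError("AQI outside the breakpoint table")


print(aqi_to_pm25(50))  # 12.0
```

Other pollutants follow the same formula with their own breakpoint tables and units, which are not reproduced here.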

Table 1 The source of daily confirmed cases of COVID-19

Estimation of the time-dependent reproduction number (Rt)

Rt, the time-dependent reproduction number, reflects the transmission of infectious diseases in the population (Cowling et al. 2010; Wallinga and Teunis 2004) and was estimated with the "EpiEstim" package in R. Based on the research of the Chinese CDC (Li et al. 2020b), we assumed that the serial interval follows an offset gamma distribution with a mean of 7.5 days and a standard deviation of 3.4 days. The smoothing window was set to 10 days. The epidemic grows when Rt is above 1, and the outbreak will die out once Rt stays below 1. Cross-sectional analysis was performed to examine the spatial association between air pollutants and Rt of COVID-19, and longitudinal analysis was used to examine the temporal associations of air pollutants with Rt.
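To make the quantities in this estimation concrete, the sketch below computes a simplified renewal-equation estimate of Rt in Python. It is only an illustration in the spirit of the estimation described above, not the EpiEstim implementation. It assumes a plain ratio estimate instead of a Bayesian posterior, and a serial interval discretized by evaluating the gamma density at whole days, truncating at 30 days, and renormalizing (no offset). The function names `gamma_si_weights` and `estimate_rt` are hypothetical.

```python
import math


def gamma_si_weights(mean, sd, max_days=30):
    """Discretized serial-interval weights w_1..w_max_days that sum to 1.

    Assumption: the gamma density is evaluated at integer days and
    renormalized, which is cruder than the discretization in EpiEstim.
    """
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean

    def log_pdf(x):
        return ((shape - 1) * math.log(x) - x / scale
                - math.lgamma(shape) - shape * math.log(scale))

    raw = [math.exp(log_pdf(s)) for s in range(1, max_days + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def estimate_rt(incidence, weights, window=10):
    """Ratio of new cases to total infectiousness over a sliding window."""
    n = len(incidence)
    # Total infectiousness on day t: earlier cases weighted by the serial interval.
    infectiousness = [
        sum(incidence[t - s] * weights[s - 1]
            for s in range(1, min(t, len(weights)) + 1))
        for t in range(n)
    ]
    rt = [None] * n
    for t in range(window, n):
        days = range(t - window + 1, t + 1)
        denom = sum(infectiousness[d] for d in days)
        if denom > 0:
            rt[t] = sum(incidence[d] for d in days) / denom
    return rt


# Serial interval with mean 7.5 days and SD 3.4 days, 10-day window, as in the text.
weights = gamma_si_weights(7.5, 3.4)
daily_cases = [10.0 * 1.05 ** t for t in range(60)]  # made-up incidence series
rt_series = estimate_rt(daily_cases, weights, window=10)
```

Estimates for the first weeks of a series are biased upward in this sketch because the infectiousness sum is still incomplete there.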

Statistical analysis

To determine the relationships among the air pollutants, and between the air pollutants and the transmission of COVID-19 (Rt), we used Spearman correlation with a significance level of α = 0.05 (two-sided). Based on the correlation analysis, two frequently used models, multiple linear regression and generalized additive models (GAM), were both used in the study. For the multiple linear regression model, Rt was used as the dependent variable, and the daily air pollutant levels were selected as independent variables. The formula used was as follows:

$$ Y=\alpha +{\beta}_1{X}_1+{\beta}_2{X}_2+\cdots +{\beta}_n{X}_n $$

In this model, the outcome variable, Y, is assumed to be a linear function of a set of predictor variables, where n is the number of predictor variables and α is a numerical constant that represents the intercept. The βs are the partial regression coefficients: each β reflects how Y changes with its associated X when all other X variables are held constant (Jaccard et al. 2006). The Xs are the air pollution parameters that are significantly associated with Rt.
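As an illustration of the regression equation above, the following Python sketch estimates α and the βs by ordinary least squares through the normal equations. It is a minimal sketch under stated assumptions: the data are made up, the function name `fit_linear_regression` is hypothetical, and no standard errors or p-values are computed, so it does not reproduce the significance tests behind the reported results.

```python
def fit_linear_regression(X, y):
    """Return [alpha, beta_1, ..., beta_n] minimizing the squared error.

    X is a list of rows (one row of predictor values per day) and y the
    list of responses. Solves the normal equations (A^T A) b = A^T y by
    Gauss-Jordan elimination with partial pivoting.
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]  # intercept column first
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]


# Made-up example with two predictors (for instance, two pollutant levels).
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
y = [2.0 + 3.0 * x1 - 0.5 * x2 for x1, x2 in X]
alpha, beta1, beta2 = fit_linear_regression(X, y)
```

Because the example response is generated without noise, the fit recovers the generating coefficients up to floating-point error. With real data each β would be accompanied by a significance test.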

GAM, developed by Hastie and Tibshirani (Hastie and Tibshirani 1995), was also used to estimate the association between PM2.5, PM10, SO2, NO2, CO, and Rt. GAM fitting uses nonlinear smoothing terms. The regression equation used to predict the response variable is shown below:

$$ g\left(E(Y)\right)=\beta {X}_1+\sum \limits_{i=2}^p{S}_i\left({X}_i\right) $$

where E(Y) is the expected value of the dependent variable, Rt; g is the link function; β is the coefficient of the linear term X1; Xi represents the level of the ith air pollutant (the independent variables); and Si is the nonparametric smoothing function.

Different methods are used to fit the model depending on the distribution of the dependent variable. Commonly used distributions in GAM modeling are the Normal, Gamma, and Poisson distributions (Ravindra et al. 2019). In this paper, we applied the Poisson distribution to examine the moving average lag effect (7 days) of air pollutants on daily values of Rt of COVID-19, and all Poisson regression analyses were performed in R (version 3.6.2) with the “mgcv” package.
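The 7-day moving average lag mentioned above can be sketched as a trailing mean of each daily pollutant series, computed before the series enters the model. The sketch assumes that the window covers the current day and the six preceding days, which the text does not specify. The function name `trailing_moving_average` is hypothetical, and the smoothing-term fit itself is not reproduced here.

```python
def trailing_moving_average(values, window=7):
    """Mean of the current day and the (window - 1) preceding days.

    Days without a full window of history are returned as None.
    Assumption: the window includes the current day (lag 0 to lag 6).
    """
    averaged = []
    for t in range(len(values)):
        if t + 1 < window:
            averaged.append(None)
        else:
            averaged.append(sum(values[t - window + 1:t + 1]) / window)
    return averaged


# Made-up daily PM10 series in ug/m3.
pm10 = [18.0, 20.5, 19.0, 22.0, 17.5, 16.0, 21.0, 23.5]
pm10_ma7 = trailing_moving_average(pm10)
```

The averaged series would then replace the raw daily values as the explanatory variable for the corresponding day.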

Results

Daily pollutant data

As shown in Fig. 2 and Fig. S1, the highest median concentrations among the five cities were found in Sao Jose dos Campos for particulate matter (PM2.5, 11.040 μg/m3; PM10, 19.440 μg/m3), in Bogota for O3 (0.020 ppm) and CO (6.453 ppb), in Guayaquil for NO2 (4.558 ppb), and in Sao Paulo for SO2 (10.706 ppb). The concentrations of the other pollutants were at similar levels across the five cities. According to Spearman’s correlation coefficients, the six pollutants were positively correlated with one another, in most cases at a highly significant level (p<0.01). The exception was O3, which showed a negative correlation, or a weak positive correlation (in Vitoria), with the other pollutants in the three Brazilian cities (Fig. 3A–3C). In Bogota (Fig. 3D), there were strong positive correlations (p<0.01) between all pairs of pollutants except O3/NO2 and SO2/CO. In Guayaquil (Fig. 3E), the correlation between any two pollutants was statistically significant (p<0.05).

Fig. 2 Daily changes in the number of confirmed COVID-19 cases and air pollution in the selected regions. The gray areas indicate the number of daily confirmed cases. The colored lines represent the changes in pollution over the corresponding time: the red line represents PM2.5, the dashed line represents PM10, the blue line represents O3, the green line represents NO2, the purple line represents SO2, and the orange line represents CO

Fig. 3 Spearman correlation between air pollution and Rt in the five regions. The color gradient indicates Spearman’s correlation coefficients. Darker blue indicates a stronger positive correlation, and darker red indicates a stronger negative correlation. Significance is marked by * p<0.05, ** p<0.01

Epidemiological data in the selected regions

The calculated Rt values (Fig. 4) showed a gradual decline in all five regions, particularly in Sao Jose dos Campos, where Rt peaked at 5.56 and then fell to 1.16 on June 10. In Guayaquil, Rt decreased from a peak of 1.72 to 0.24. By contrast, Rt in Vitoria fluctuated and remained above 1 until June 10, indicating that the epidemic situation in that region was still serious.

Fig. 4 Daily estimated distributions of the effective reproduction number Rt, based on selected epidemiological data for COVID-19, with 95% confidence intervals. The dashed line represents the threshold of Rt = 1

Model fitting

The results of the GAM and multiple linear models are shown in Fig. 5, Figs. S2-S4, and Table 2. By fitting GAM models between the pollutant factors (explanatory variables) and Rt (response variable), we obtained the smooth regression functions of the explanatory variables and the effect plots of the influencing factors on Rt (Fig. 5 and Figs. S2-S4). The results show a nonlinear relationship between Rt and each explanatory variable in Sao Paulo (Fig. 5A), where Rt decreases gradually as PM10 concentration increases. Rt increases monotonically when O3 concentration is less than 0.012 ppm or between 0.014 ppm and 0.018 ppm. When SO2 concentration is less than 0.6 ppb, Rt shows a slowly decreasing trend, whereas Rt increases with SO2 when SO2 concentration is higher than 0.6 ppb. In Sao Jose dos Campos (Fig. 5B), Rt shows an increasing trend when PM10 concentration is between 15 μg/m3 and 20 μg/m3. Rt shows a fluctuating downward trend as O3 and NO2 concentrations increase, and changes only weakly as SO2 concentration increases. In Vitoria (Fig. S2), Rt shows a fluctuating downward trend as PM10 and NO2 concentrations increase. When O3 concentration is less than 0.012 ppm, Rt decreases gradually; when O3 concentration is higher than 0.012 ppm, Rt changes relatively gently. Rt fluctuates widely when SO2 concentration is below 4.0 ppb and shows almost no significant change when it is higher than 4.0 ppb. Rt increases monotonically when CO concentration is higher than 0.11 ppb. In Bogota (Fig. S3), Rt fluctuates somewhat as PM10 and NO2 concentrations increase, but its value does not change much. When O3 concentration is less than 0.004 ppm, Rt shows a monotonically increasing trend, while it decreases as O3 concentration rises above 0.008 ppm. When SO2 concentration is less than 0.35 ppb, Rt tends to decrease slowly, while above this concentration Rt increases gradually. Rt shows a fluctuating rising trend as CO concentration increases, and there is no obvious change when CO concentration is higher than 0.42 ppb. In Guayaquil (Fig. S4), Rt decreases slowly when PM10 concentration is lower than 17 μg/m3 and increases slowly when it is higher than 17 μg/m3. Rt increases when O3 concentration is less than 0.03 ppm and decreases above 0.03 ppm. Rt changes relatively slowly as SO2 concentration increases. The relationship between CO concentration and Rt is approximately linear: as CO concentration increases, Rt decreases monotonically.

Fig. 5 The results of the GAM model for the effects of air pollutants on the variation of Rt. The gray areas represent the upper and lower limits of the confidence intervals for the fitted additive functions, and the solid lines represent the smooth fitted curves of Rt. The horizontal coordinates represent the measured values of the explanatory variables, and the ordinates represent the smooth fitted effects of the explanatory variables on Rt. Values in parentheses on the ordinates represent the estimated degrees of freedom

Table 2 Summary of the models

In the multiple linear regression results (Table 3), the magnitude of each β reflects the influence of the corresponding variable. In Sao Paulo, Rt was positively correlated with O3 (β= 13.135) and negatively correlated with SO2 (β= −0.320). In Vitoria, Rt was negatively correlated with NO2 (β= −0.147) and SO2 (β= −0.053). In Bogota, Rt was negatively correlated with PM10 (β= −0.013). For Sao Jose dos Campos and Guayaquil, no significant linear relationship was found.

Table 3 Statistical data of the multiple linear regression equation

Discussion

In this study, we analyzed the relationship between COVID-19 infection and air pollution in five regions of South America. Our data span a wider range of time and space and more types of pollutants than previous studies. Following previous studies, two frequently used models, generalized additive models (GAM) and multiple linear regression, were both applied to each city. The results of the two models for the same region differed. According to our results, Rt responds to different air pollutants (PM2.5, PM10, O3, SO2, NO2, CO) in different regions. Although the multiple linear regression model did not identify the same significant factors across regions, the GAM model showed that PM10 and SO2 were significantly associated with Rt in all regions. A previous study using the GAM model showed that COVID-19 confirmed cases were significantly positively associated with PM2.5, PM10, CO, and NO2, and negatively associated with SO2 (Liu et al. 2021). Research on the spread of COVID-19 and environmental quality showed that PM10, NO2, CO, and O3 are significantly associated with the COVID-19 pandemic in South America (Bilal et al. 2021a). Our findings are partially consistent with these results. Currently, there are limited studies on air pollution and the spread of COVID-19, and their conclusions differ. Regional differences could have a potential impact on epidemic transmission, including variation in the timing and coverage of public health interventions (Dalziel et al. 2018). Because environmental conditions differ, even the impact of the same pollutant will vary. We compiled information from similar previous studies and summarized the types of pollutants, the study areas, the study periods, and the fitting models used (Table 4). All of these studies report that air pollutants are significantly correlated with the spread of COVID-19. However, the results differ with the study area, the period covered, and the model used.

Table 4 Comparison of correlational studies between air pollution and COVID-19 in various studies

Although this analysis yields some findings, they should be interpreted with caution. Our findings do not reveal a clear effect of pollutants on the virus’s ability to spread, even though our data cover a broader range of space and time, and more types of pollutants, than most previous studies. This suggests that the effects of pollutants on the disease may not be specific, and that the results of the GAM and multiple linear models for individual cities cannot be directly generalized to other regions. The models currently used also have shortcomings: conclusions drawn from a single fitting result cannot be applied directly, and additional models should be explored in future research.

This study has several limitations. First, the timing of the acquisition of epidemic data may differ across regions. We therefore encourage further research on the association of environmental indicators with COVID-19 transmission in a wider range of regions to provide more insight. Second, many other models are available for this kind of analysis. Only GAM and multiple linear models were adopted in this paper, and other models may be more appropriate. Third, many other factors contribute to outbreaks of COVID-19, such as individual behavior, government control measures, urban density, and population, whereas only air pollution is considered in this paper. Social intervention, as well as population immunity, may have a greater impact on the virus’s ability to spread than air pollution (Baker et al. 2020). Future research needs to investigate the impact of social intervention, urban density, and population.

Conclusion

Over the past few months, South America has become the central region of the COVID-19 pandemic, as the infection rate of COVID-19 has been increasing. The current work investigates six environmental pollutants in South America and their association with the COVID-19 pandemic in the region. Based on the analysis, this study reached the following conclusions: (1) Rt, which reflects the spread of COVID-19, showed a gradual decline in all five regions. (2) PM10 and SO2 were significantly associated with Rt in all selected regions. Regulators should strengthen the monitoring of these two pollutants. (3) The association between air pollution and the spread of COVID-19 differed among cities with different air pollution conditions. GAM and the multiple linear regression model also gave inconsistent results for the same city. For example, in Sao Jose dos Campos, GAM revealed that four air pollutants, PM10, SO2, NO2, and O3, were significantly related to the spread of COVID-19, whereas the multiple linear regression model showed no significant relationship between air pollution and the spread of COVID-19. There remains a need to optimize models for assessing the contribution of air pollution to the COVID-19 pandemic. In addition, more regions need to be studied to reveal the association between air pollution and the spread of COVID-19. Future research should take a comprehensive approach, including consideration of epidemiological aspects, socio-economic issues, and the different lockdowns and mobility restrictions imposed by each country.