1 Introduction

Since the first case was reported in Wuhan, China, in December 2019, Coronavirus Disease 2019 (COVID-19) has evolved into a pandemic within merely 3 months (World Health Organization 2020). By April 30, 2020, more than 3.2 million COVID-19 cases had been reported in 186 countries, with more than 228,000 deaths from the disease (Johns Hopkins University 2020). The virus is known to transmit from person to person either directly through droplets produced by coughing, sneezing, and talking or indirectly through contaminated surfaces (Bourouiba 2020; Lai et al. 2020). The combination of high reproduction and fatality rates of COVID-19, together with the lack of a vaccine and of herd immunity, has forced governments around the world to impose travel restrictions, business closures, and social distancing measures in the hope of “flattening” the outbreak curve so that new cases do not overwhelm the health care system. In the United States (U.S.), the federal government declared a national emergency on March 13, 2020 and encouraged all citizens to minimize non-essential social and business activities.

A wide range of case growth rates has been observed across the 50 U.S. states since March 13. This partially reflects non-uniform enforcement of social distancing among the states and differences in demographic features such as population density and age distribution. Local meteorology, however, may be a factor that cannot be ignored. It has been reported that temperature, relative humidity, and wind speed influence the survival and transmission of the Severe Acute Respiratory Syndrome (SARS) coronavirus, a close relative of the virus that causes COVID-19 (Yuan et al. 2006; Yip et al. 2007). Laboratory studies also suggest that coronaviruses prefer cool and dry ambient conditions, in which they may remain viable for days (Chan et al. 2011; van Doremalen et al. 2013). It has been hypothesized that the pandemic will slow down in the coming summer of the northern hemisphere, much like seasonal influenza, though this hypothesis has yet to be tested at the population level (Araujo and Naimi 2020).

The U.S. territory spans multiple climate zones, and this research represents one of the first epidemiological studies to investigate relationships between COVID-19 incidence and weather conditions at the county level. Our findings should catalyze further research in this area and inform the modelling of COVID-19 outbreaks for better projections and management efforts.

2 Study area, study period, and data source

We selected the 50 U.S. counties where COVID-19 was most prevalent for the period from March 16, 2020 to April 22, 2020; the study period starts on March 16 because county-level data were not available before that date. These counties are mostly located in the eastern U.S., with 20 of them in the states of New York and New Jersey. The confirmed cases of COVID-19 were downloaded from the Johns Hopkins Coronavirus Resource Center (Dong et al. 2020). The original data were cumulative over days, from which we calculated daily new confirmed cases in each county. Since the case data for five counties (Bronx, Kings, New York, Queens, and Richmond) in New York State were not differentiated before April 22, 2020, their daily data were aggregated into one record to represent the Greater New York City area. The meteorological data were based on long-term weather stations in each county, typically at the main airports, and downloaded from the Weather Underground database (https://www.wunderground.com/). Because some of the available meteorological measures are highly correlated (e.g., temperature and dew point temperature) or not known to be associated with respiratory diseases (e.g., wind speed and sea level pressure), we selected only seven measures for further analyses: maximum temperature, minimum temperature, average temperature, maximum relative humidity, minimum relative humidity, average relative humidity, and precipitation. Lastly, we adopted confounding variables describing socioeconomic status from the American Community Survey’s 5-year (2014–2018) estimates, including county-specific population, age, gender, and racial composition, and poverty level (U.S. Census Bureau 2019).
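
The two data-preparation steps described above (merging the five New York City counties into one record and differencing cumulative counts into daily new cases) can be sketched as follows. This is a minimal illustration rather than the authors' actual pipeline: the function names, the merged label, the placeholder "County A", and the toy counts are all assumptions made for the example.

```python
# Minimal sketch (standard library only); names and numbers are illustrative.

# County names come from the text; the merged label is a hypothetical choice.
NYC_COUNTIES = {"Bronx", "Kings", "New York", "Queens", "Richmond"}
NYC_LABEL = "Greater New York City"


def merge_nyc(cumulative_by_county):
    """Return a copy in which the five NYC counties are summed day by day."""
    merged = {}
    nyc_series = []
    for county, series in cumulative_by_county.items():
        if county in NYC_COUNTIES:
            nyc_series.append(series)
        else:
            merged[county] = list(series)
    if nyc_series:
        # zip(*...) walks the NYC series in parallel, one tuple per day.
        merged[NYC_LABEL] = [sum(day) for day in zip(*nyc_series)]
    return merged


def daily_new_cases(cumulative):
    """Difference consecutive cumulative counts.

    The first day has no predecessor, so the result is one element shorter.
    Retrospective corrections in the source data could produce negative
    values; how such cases were handled is not stated in the text, so they
    are left untouched here.
    """
    return [today - prev for prev, today in zip(cumulative, cumulative[1:])]


# Toy cumulative counts for three consecutive days (not real data).
toy = {
    "Bronx": [1, 3, 6],
    "Kings": [2, 2, 5],
    "New York": [0, 1, 1],
    "Queens": [1, 4, 4],
    "Richmond": [0, 0, 2],
    "County A": [5, 9, 12],  # hypothetical non-NYC county
}
merged = merge_nyc(toy)
daily = {county: daily_new_cases(series) for county, series in merged.items()}
```

Merging before differencing keeps the Greater New York City record consistent with the cumulative totals, since the sum of differences equals the difference of sums.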

3 Statistical analyses and findings

We applied the generalized additive model (Hastie and Tibshirani 1990) to evaluate the association between meteorological factors and the incidence of COVID-19. The outcome variable was the number of new confirmed cases, modelled with a log link function as a combination of linear terms, nonlinear functions, and an offset equal to the logarithm of the county-level population. We proposed three models sharing the same confounding variables: day of the week, the county’s Federal Information Processing Standard (FIPS) code, median age, male percentage, White percentage, Black percentage, Hispanic percentage, and poverty percentage. We also included a p-spline (Wood 2017) for calendar time to control for temporal autocorrelation, as well as a two-dimensional spatial function of latitude and longitude to control for spatial autocorrelation. The spatial function also accounted for unobserved risk factors in each county, such as human activities and policies (e.g., stay-at-home orders). The main difference among the three models is how the meteorological measures entered the model. In Model 1, meteorological factors were modelled as linear terms. In Model 2, we alternatively considered a p-spline for each meteorological measure. Models 1 and 2 were each fitted seven times, once for each of the seven meteorological measures, and the Akaike information criterion (AIC) was used to identify whether meteorological impacts on COVID-19 were linear or nonlinear.
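
For readers who prefer a formula, the model structure described above can be written schematically as below. This is a sketch under stated assumptions: the notation (county index c, day index t, and the symbols Y, \mu, P, x, z, f, s) is introduced here for illustration and does not appear in the original text, and the count distribution is left unspecified because the text does not name it.

```latex
% Requires amsmath. Y_{ct}: new confirmed cases in county c on day t;
% P_c: county population (offset); x_{ct}: one meteorological measure;
% z_{ct}: confounders (day of week, county FIPS code, median age,
% sex, race/ethnicity, and poverty percentages).
\begin{align*}
  Y_{ct} &\sim \text{count distribution with mean } \mu_{ct}, \\
  \log \mu_{ct} &= \log P_c + \beta_0 + f(x_{ct}) + s_{\mathrm{time}}(t)
      + s_{\mathrm{space}}(\mathrm{lat}_c, \mathrm{lon}_c)
      + \mathbf{z}_{ct}^{\top}\boldsymbol{\gamma}, \\
  f(x_{ct}) &=
    \begin{cases}
      \beta_1 x_{ct} & \text{(Model 1, linear term)} \\
      s_x(x_{ct})    & \text{(Model 2, p-spline)}
    \end{cases}
\end{align*}
```

Because the population enters as an offset with a fixed coefficient of one, the remaining terms describe incidence per capita rather than raw counts.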

Table 1 shows that six of the seven meteorological factors were significantly and linearly associated with COVID-19 according to Model 1. Temperature and precipitation measures were all negatively related to COVID-19, while relative humidity measures were positively related to COVID-19. All AICs were smaller in Model 2 than in Model 1, suggesting that the meteorological impacts on COVID-19 were nonlinear.
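
The AIC comparison used above follows the standard definition AIC = 2k − 2 ln L, where k is the number of parameters (for a spline model, typically the effective degrees of freedom, which may be fractional) and L is the maximized likelihood. The sketch below illustrates the selection rule only; the log-likelihoods and parameter counts are made-up placeholders, not values from this study.

```python
# Minimal sketch of AIC-based model comparison (standard library only).


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2.0 * n_params - 2.0 * log_likelihood


def prefer_lower_aic(fits):
    """Return the name of the fit with the smallest AIC.

    `fits` maps a model name to a (log_likelihood, n_params) pair.
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits for one meteorological measure (placeholder numbers).
fits = {
    "Model 1 (linear)": (-1050.0, 12),
    "Model 2 (p-spline)": (-1020.0, 18.5),  # fractional effective d.f.
}
best = prefer_lower_aic(fits)
```

In this toy case the spline model is preferred because its gain in log-likelihood outweighs the penalty for its additional effective parameters.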

Table 1 The linear association between meteorological factors and COVID-19 from Model 1

Figure 1 shows that the percentage change in relative risk (RR %) of COVID-19 incidence varied in different ways across meteorological factors. Among the three temperature measures, average temperature had the smallest AIC and so best addressed the association with the incidence of COVID-19. An RR % larger than 10% was more likely to be observed at average temperatures of 63.44 °F or higher, and the highest RR % was 19.33% (95% CI = 16.54, 22.20), reached when the average temperature increased to 67.30 °F. Beyond that point the RR % declined, becoming negative when the average temperature exceeded 74.37 °F. Moreover, among the three relative humidity measures, minimum relative humidity had the smallest AIC and thus best addressed the association with the incidence of COVID-19. A higher RR % was more likely to be observed at low minimum relative humidity, from 0 to 10%, although this estimate was not reliable because observations in this range were very limited. The RR % showed another surge when minimum relative humidity exceeded 92.61%. Furthermore, precipitation had a positive impact on COVID-19 between 1.27 and 1.74 inches, with the largest increase in the RR % of COVID-19 incidence, 33.89% (95% CI = 28.63, 39.35), at 1.55 inches of precipitation. The RR % then declined significantly, becoming negative when precipitation exceeded 1.77 inches.
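
The text does not state how the RR % was computed. Under a log link, a common convention is RR % = (exp(η) − 1) × 100, where η is the estimated effect on the log scale, with the confidence interval built on the log scale and then transformed. The sketch below assumes that convention and a normal approximation; the function names are illustrative. As a consistency check, it back-calculates a log-scale standard error from the reported interval for average temperature and rebuilds the interval.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def rr_percent(log_effect):
    """Percentage change in relative risk implied by a log-scale effect."""
    return (math.exp(log_effect) - 1.0) * 100.0


def log_effect_from_rr_percent(rr_pct):
    """Inverse of rr_percent: log-scale effect implied by an RR %."""
    return math.log(1.0 + rr_pct / 100.0)


def rr_percent_with_ci(log_effect, se):
    """Return (point, lower, upper) RR % from a log-scale estimate and SE."""
    return (
        rr_percent(log_effect),
        rr_percent(log_effect - Z_95 * se),
        rr_percent(log_effect + Z_95 * se),
    )


# Reported values for average temperature at 67.30 °F (from the text).
point, lower, upper = 19.33, 16.54, 22.20

# Back-calculated quantities (assumptions, not reported by the authors).
eta = log_effect_from_rr_percent(point)
se = (log_effect_from_rr_percent(upper)
      - log_effect_from_rr_percent(lower)) / (2 * Z_95)

rebuilt = rr_percent_with_ci(eta, se)
```

The rebuilt bounds agree with the reported ones to within rounding, which is consistent with (though it does not prove) an interval that is symmetric on the log scale.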

Fig. 1 Nonlinear influence of meteorological factors on COVID-19 incidence from Model 2

Finally, because they had the smallest AICs among the temperature and relative humidity measures, average temperature and minimum relative humidity were selected and fitted together with precipitation in Model 3 to investigate their compound influence on COVID-19 incidence. We also included three additional smoothing functions, for 1- to 3-day lags, for each selected meteorological factor to take lagged effects into account. Model 3 had the smallest AIC (13537271) compared with all AICs in Model 1 and Model 2. We evaluated the overall impact of each selected meteorological factor by summing all smoothing functions from the main effect and the three lagged effects, as shown in Fig. 2. In particular, the overall impact of average temperature had a clear threshold at 59.5 °F, where it turned from negative to positive. After reaching the greatest RR % (73.40%; 95% CI = 62.81, 84.66), the overall impact decreased consistently above 68.45 °F. The RR % fell below 0% when the average temperature exceeded 84.67 °F, although the overall impact was no longer significant there. Moreover, the overall impact was higher at lower levels of minimum relative humidity, and the trend gradually went downward to reach the smallest RR % of −36.22% (95% CI = −40.14, −32.04) when the minimum relative humidity increased to 91.42%. Nonetheless, the overall impact returned to positive when minimum relative humidity exceeded 97.28%. Furthermore, the overall impact on COVID-19 climbed to its highest level (RR % = 232.07%; 95% CI = 199.77, 267.85) at 1.60 inches of precipitation and then decreased significantly, becoming negative when precipitation exceeded 1.85 inches.
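
The lag construction and the summation of smooths described above can be sketched as follows. Two assumptions are made here: each lagged smooth is evaluated at the same meteorological value when the overall curve is drawn, and the RR % is obtained from the summed log-scale effect. The toy linear functions merely stand in for fitted p-splines and carry no empirical meaning.

```python
import math


def lagged(values, lag):
    """Shift a daily series by `lag` days within one county.

    The first `lag` days have no lagged value and are filled with None.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    return ([None] * lag + list(values))[:len(values)]


def overall_rr_percent(smooths, x):
    """Sum the main and lagged smooths at x, then convert to RR %."""
    total = sum(s(x) for s in smooths)
    return (math.exp(total) - 1.0) * 100.0


# Toy daily average temperatures (°F) for one county (not real data).
temps = [50.0, 55.0, 60.0, 65.0]
lag1 = lagged(temps, 1)
lag3 = lagged(temps, 3)

# Toy stand-ins for the main-effect and 1- to 3-day lag smooths.
toy_smooths = [
    lambda t: 0.020 * (t - 59.5),  # main effect
    lambda t: 0.010 * (t - 59.5),  # 1-day lag
    lambda t: 0.005 * (t - 59.5),  # 2-day lag
    lambda t: 0.0,                 # 3-day lag
]
at_threshold = overall_rr_percent(toy_smooths, 59.5)
above_threshold = overall_rr_percent(toy_smooths, 69.5)
```

Lags should be built separately for each county so that values from one county's series never spill into another's.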

Fig. 2 Overall impacts of average temperature, minimum relative humidity, and precipitation on COVID-19 incidence from Model 3

4 Discussion and conclusion

As the COVID-19 pandemic has developed, biomedical researchers have devoted great effort to finding treatments and vaccines against COVID-19 (Chen et al. 2020; Cascella et al. 2020). Unfortunately, clinical trials need time to pass through several phases to confirm that a treatment or a vaccine works effectively without side effects (Gouglas et al. 2018). Previous experience with SARS and Zika did not yield successful treatments and vaccines because both epidemics ended before treatment and vaccine development was complete (Lurie et al. 2020). It has been reported that approved treatments or vaccines for COVID-19 probably cannot be developed in less than 12–18 months (Lanese 2020). Therefore, people have come to anticipate that the COVID-19 pandemic will be mitigated in the summer, because a similar disease, SARS, became an epidemic in February 2003 and quickly diminished in July 2003 (Dutton 2020).

Previous studies have reported that higher temperature and humidity can reduce the transmission of COVID-19 (Wang et al. 2020), and preliminary research hosted by the U.S. Department of Homeland Security indicates that the coronavirus died faster in hotter and more humid conditions, although the raw data and analytic report have not been released or peer-reviewed (U.S. Department of Homeland Security 2020). From the viewpoint of epidemiology, such evidence is far from conclusive. Some multi-city studies in China found no strong evidence that the COVID-19 epidemic would be mitigated when the weather becomes warmer (Xie and Zhu 2020; Yao et al. 2020). In our study, we first investigated the linear association between meteorological factors and COVID-19 in the U.S. We found that higher temperatures significantly reduced the risk of COVID-19, while higher relative humidity significantly increased the risk. We further found that a nonlinear association might better address the impact of meteorological factors on COVID-19. While increasing temperature and relative humidity may lead to a decline in risk, this happened only within certain temperature and relative humidity ranges. The mitigation might not occur at higher levels of relative humidity approaching 100%. In the final model (Model 3), our findings did not support the hypothesis and expectation that higher temperature was significantly negatively associated with COVID-19, even though the RR % decreased significantly at higher temperatures. It is therefore still too early to conclude that the pandemic will be mitigated in the summer.

A new finding that has not been fully discussed in the literature is that precipitation can greatly reduce the risk of COVID-19. We initially wondered whether the variation in COVID-19 incidence was more likely explained by rainfall than by relative humidity. Nonetheless, the correlation between precipitation and relative humidity is quite low (ρ = −0.06 to 0.20). Thus, from a statistical viewpoint, the two meteorological factors may not affect each other in our models. We have not yet seen reliable evidence in the literature that precipitation reduces the risk of COVID-19, and our finding on precipitation should be interpreted conservatively because such a short study period can hardly represent the true profile of precipitation during this year.

To sum up, the study found that average temperature, minimum relative humidity, and precipitation can better address the impact of meteorological factors on COVID-19. However, we did not find evidence that higher levels of minimum relative humidity reduce the risk of COVID-19. Although an increasing average temperature can likely reduce the risk of COVID-19, the evidence at the highest levels of average temperature was still not strong enough. Precipitation can greatly reduce the risk of COVID-19, though observations of heavy precipitation are few in this study. Note that valid data points were few in the lowest and highest ranges of each meteorological measure because of the short study period, making the results more uncertain in those ranges. Thus, further research is warranted to establish the relationship over a longer study period. Our findings remind people that the fight against the pandemic is still ongoing and that we cannot yet afford to relax our guard against COVID-19. Therefore, rather than expecting the natural environment to curb the coronavirus, it is more realistic to practice better self-protection during the pandemic and to wait patiently for treatments and vaccines.