Introduction

The case fatality rate of patients with COVID-19 was very high in the initial period of the outbreak, approximately 30% according to a report from Wuhan (Zhou et al. 2020a, b). After a period of fighting this highly contagious disease, the rate was reduced substantially to approximately 2% (Dong et al., 2020), thanks to more effective management of the disease. However, because of its large scale, the disease burden has remained one of the top public health concerns for almost all countries around the world. Mortality due to COVID-19 has been shown to be associated with socio-demographic and clinical factors including being male, older age, deprivation, ethnicity, obesity, diabetes, severe asthma and other chronic conditions (Williamson et al., 2020; Yehia et al., 2020; Lassale et al., 2020; Aggarwal et al., 2020; Holman et al., 2020; Al-Salameh et al., 2020).

The optimal status of specific nutrients is considered crucial to maintaining the normal function of human immune components. Specifically, based on the scientific evidence collected so far, the European Food Safety Authority deems four trace elements (selenium (Se), zinc (Zn), iron (Fe) and copper (Cu)) to be essential for the normal activity of the immune system (Scientific Opinion & EFSA, 2009a, b, c, d, 2011).

Se is an important element for the normal function of the human body, with anti-inflammatory, antioxidant and immune effects (Yuan et al., 2022). Se deficiency in the host is related to RNA-virus replication and virulent mutations, which may lead to more severe symptoms of the infection (Liu et al., 2021). Zn, under physiological conditions, is essential for immunoregulatory functions including cellular growth and the maturation of immune cells, particularly the development and activation of T cells (Wintergerst et al., 2006). The antiviral property of Zn has been studied extensively in various viral infections, including hepatitis C virus, Human Immunodeficiency Virus (HIV), and coronavirus (Barocas et al., 2019). Fe metabolism and anemia are proposed to be possible factors playing an important role in multiple organ dysfunction syndrome in COVID-19 patients (Taneri et al., 2020). Cu is involved in the functions of critical immune cells including T cells, B cells, neutrophils, natural killer cells and macrophages. People with Cu deficiency show an excess susceptibility to infections because of the decreased number and function of these immune cells (Raha et al., 2020).

There have been a few research articles providing preliminary clinical evidence on associations between the physiological status of these trace elements and the outcome of COVID-19 (Zhang et al., 2020; Moghaddam et al., 2020; Im et al., 2020; Jothimani et al., 2020; Taneri et al., 2020), and several review papers proposing their therapeutic perspectives for COVID-19 treatment (Raha et al., 2020; Alexander et al., 2020; Shakoor et al., 2021; Calder, 2020; Calder et al., 2020). However, no study has investigated the geographical distributions of these elements and epidemiological features of COVID-19 on a large scale during the evolving epidemic. The concentrations of these elements in plants, animals and humans are fundamentally influenced by their geographical distributions (e.g., in soil and water). Geographical distributions of trace elements in the environment have been shown to influence the health of the local population (Dinh et al., 2018). In China, notably, the incidence of Keshan disease (KD) was closely associated with soil Se concentration in the KD endemic areas (Zhang et al., 2019). Furthermore, the findings of many community trials showed that Se supplementation effectively prevented the occurrence of acute and sub-acute KD (Zhou et al., 2018). In the USA, environmental cadmium (Cd) was found to be associated with mortality from influenza and pneumonia in the adult population (Park et al., 2020).

Hence, we hypothesized that variation in concentrations of immune function-related trace elements in the local environment may be related to the severity of COVID-19 in that region. Specifically, we aimed to assess whether there are associations between the geographical distributions of Se, Zn, Fe and Cu in surface soils, and the case fatality rate of COVID-19, using the national data of USA reported at the county level.

Materials and methods

This is an ecological study (“Modern Epidemiology” 2020), with data from different USA geographical areas (on the national scale, and at the county level) correlating two sets of variables: epidemiological data (case fatality rate) of COVID-19, and geochemical concentration data of Se, Zn, Fe and Cu. Characteristics of population, socio-demographics and residential environment in USA by county were also included for potential confounding adjustment.

Data collection

Geochemical concentrations of Se, Zn, Fe and Cu in surface soils (and stream sediments, from a depth of about 20 cm) at the county level of USA were obtained from the National Geochemical Survey, with samples analyzed between 1997 and 2009. This dataset contains chemical analyses for more than 70,000 samples covering the majority of counties (3086/3140, 98.3%) of the USA (Smith et al., 2013). The National Geochemical Survey was conducted by the United States Geological Survey in collaboration with other federal and state government agencies, industry and academia. The survey aimed to produce a body of geochemical data using a consistent set of methods in order to provide the maximum possible level of internal consistency. The goal of the National Geochemical Survey was to analyze at least one sample in every 289 km² area by a single set of analytical methods across the entire nation (Smith et al., 2013). Se concentration was determined by hydride-generation atomic absorption spectrometry (HG-AAS). The concentrations of Zn, Fe and Cu were measured using inductively coupled plasma-atomic emission spectrometry (ICP-AES) (Smith et al., 2013). The concentrations of Se, Zn, Fe and Cu, in this study, were reported as mg/kg (i.e., mg of substance per kg of solid sample).

Geochemical data in the conterminous states were collected, so Alaska and Hawaii were not included. Data on the following 25 counties were also not available: St. Mary (in Louisiana), Nantucket (in Massachusetts), Keweenaw (in Michigan), Hudson (in New Jersey), Major and Woodward (in Oklahoma), Camp, Delta, Franklin, Gregg, Hansford, Hopkins, Kinney, Lipscomb, Loving, Maverick, Morris, Ochiltree, Rains, Smith, Titus, Upshur and Wood (in Texas), and Lexington and Manassas Park City (in Virginia).

Epidemiological data on the case fatality rate of COVID-19 (i.e., the number of deaths per reported case of COVID-19) in each USA county were accessed from the COVID-19 interactive map from Johns Hopkins University, which is a web-based dashboard to track COVID-19 in real time on a daily basis (Dong et al., 2020). Numbers of cases and deaths in the interactive map included both confirmed and probable patients based on the reported information (Dong et al., 2020). To exclude potential bias, the analysis used the data of December 31, 2020. Thus, the analysis includes all the information up to the end of 2020 and excludes potential effects from COVID-19 vaccination (which started in late December 2020). Six additional data points (October 8, 2020, November 5, 2020, December 3, 2020, January 28, 2021, February 25, 2021, March 25, 2021) were also used to perform a validation analysis in order to examine whether the findings were robust over time. A 4-week interval between adjacent data points was set among the seven study dates, constructing an observational investigation based on a total period of 24 weeks.
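The schedule of study dates described above can be reproduced with a few lines of Python. This is only an illustrative sketch: the variable names are ours, and the only inputs taken from the text are the main analysis date and the 4-week spacing.

```python
from datetime import date, timedelta

# Main analysis date stated in the text.
MAIN_DATE = date(2020, 12, 31)
STEP = timedelta(weeks=4)

# Three time points before and three after the main date, 4 weeks apart.
study_dates = [MAIN_DATE + k * STEP for k in range(-3, 4)]

for d in study_dates:
    print(d.isoformat())

# Span between the first and the last time point, in whole weeks.
span_weeks = (study_dates[-1] - study_dates[0]).days // 7
```

The seven generated dates run from October 8, 2020 to March 25, 2021, which matches the 24-week observation period stated above.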

COVID-19 data (of December 31, 2020) were linked with geochemical data by county, and 3057 counties were successfully mapped (3057/3086, 99.1%). Unmapped counties included Garfield, Beaver, Kane, Uintah, Millard, Grand, Cache, Iron, Juab, Rich, Piute, Weber, Morgan, Daggett, Sanpete, Carbon, Washington, Duchesne, Box Elder, Sevier, Emery and Wayne (in Utah), Clifton Forge, Dade, Bedford City and South Boston (in Virginia), Yellowstone National Park (in Montana), Dukes (in Massachusetts), and Shannon (in South Dakota). COVID-19 data for these areas were reported at a level other than the county (e.g., the Southwest region of Utah).

Information on population (including size, gender, age, ethnicity, and death rate), socio-demographics (including educational level, household income, unemployment rate, and poverty rate) and residential environment (including the Rural–Urban Continuum Code), by county, was collected from the USA national official sources. The information was all complete on the studied counties. Population size, gender, age and ethnic distribution, and death rate were collected from the US Bureau of the Census (data of 2019). Education information (among adults) was gathered from the US Bureau of the Census (the 2014–18 American Community Survey 5-year average county-level estimates). Household income data was collected from the US Bureau of the Census (Small area income and poverty estimates (SAIPE) Program, 2019 data). Unemployment rate was collected from the US Bureau of Labor Statistics (Local Area Unemployment Statistics (LAUS), 2019 data). Poverty information was collected from the US Bureau of the Census (SAIPE, 2018 data). Rural–Urban Continuum Code (created by Economic Research Service, US Department of Agriculture), used in the current analysis, was the latest version, published in 2013 (“USDA ERS—Rural–Urban Continuum Codes” 2021).

Statistical analysis

At the county level, descriptive statistics on the regional concentrations of Se, Zn, Fe and Cu, and regional epidemiological data (number of cases, number of deaths, case fatality rate) of COVID-19 were presented first. Spearman’s correlation test was initially carried out to assess the correlation between regional concentrations of Se, Zn, Fe and Cu, and the regional COVID-19 case fatality rate. Spearman’s correlation test was also used to examine the correlations between the four studied trace elements.
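As an illustration of the rank-based correlation used here, the following sketch computes Spearman's coefficient as the Pearson correlation of average ranks. The analysis in this study was run in Stata; the function names and the five data points below are hypothetical and serve only to show the calculation (no p value is computed).

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical county-level values (not data from this study).
zn_mg_per_kg = [35.0, 48.0, 52.0, 61.0, 75.0]
cfr_percent = [2.4, 1.9, 2.0, 1.5, 1.2]
rho = spearman_rho(zn_mg_per_kg, cfr_percent)
print(round(rho, 3))  # prints -0.9
```

A negative coefficient, as in this toy example, corresponds to the inverse pattern of interest: higher concentration ranks paired with lower case fatality ranks.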

Crude odds ratios (ORs) were calculated (with 95% confidence intervals (95% CIs)) to show the unadjusted association between each potential risk factor and the case fatality rate of COVID-19, using univariable fractional (logit) outcome regression models. Multivariable fractional (logit) outcome regression modeling was then used within each domain (geochemical data of trace elements, population characteristics, socio-demographics, and residential environment) to assess the independent associations of the significant factors from the univariable analysis. Then, all significant variables in the within-domain analyses were included in a final model, with all variables entered simultaneously.
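To make the idea of a fractional (logit) outcome model concrete, the sketch below fits a one-predictor version by maximising the Bernoulli quasi-likelihood with Newton-Raphson. It is a simplified illustration rather than the Stata procedure used in this study: it handles a single predictor, returns point estimates only (no confidence intervals), and the function name and data are hypothetical.

```python
from math import exp

def fit_fractional_logit(x, y, iterations=50, tol=1e-12):
    """Fit E[y | x] = logistic(b0 + b1 * x) for fractions y in [0, 1].

    Maximises the Bernoulli quasi-likelihood by Newton-Raphson.
    Returns (b0, b1); exp(b1) is the odds ratio per one-unit change in x.
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * step = g.
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + s0, b1 + s1
        if abs(s0) + abs(s1) < tol:
            break
    return b0, b1

# Hypothetical data: 1 = county in the lowest concentration quartile,
# 0 = otherwise; y = case fatality rate expressed as a proportion.
low_quartile = [1, 1, 1, 0, 0, 0, 0, 0]
cfr = [0.021, 0.018, 0.024, 0.015, 0.013, 0.016, 0.014, 0.017]

b0, b1 = fit_fractional_logit(low_quartile, cfr)
odds_ratio = exp(b1)
print(round(odds_ratio, 3))
```

With a single binary predictor the model is saturated, so the estimated odds ratio equals the ratio of the odds of the two group mean rates. Adjusted models add further covariates to the linear predictor, which this two-parameter sketch does not cover.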

Spatial linear regression modeling was also carried out as a sensitivity analysis to assess the above association after consideration of the spatial dependency of dependent variables (i.e., geochemical concentrations) (“Stata Bookstore | Spatial Autoregressive Models Reference Manual, Release 16” 2020). The present study created the spatial-weighting matrix using inverse distances among US counties based on the dot location of each county. Information about the location of each county was collected using dot locations (latitude and longitude) based on geographic centroids. The distance between any two counties was calculated from these dot locations.
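The construction of an inverse-distance spatial-weighting matrix from centroid coordinates can be sketched as follows. Several details here are assumptions for illustration only: the great-circle (haversine) distance, the Earth radius value, the absence of any normalization of the matrix, and the three made-up centroids. The text does not specify which distance formula or normalization was applied.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def inverse_distance_weights(centroids):
    """Build W with W[i][j] = 1 / d(i, j) for i != j and 0 on the diagonal."""
    n = len(centroids)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = haversine_km(*centroids[i], *centroids[j])
            w[i][j] = w[j][i] = 1.0 / d
    return w

# Hypothetical centroids (latitude, longitude); not real county data.
centroids = [(40.0, -100.0), (40.0, -99.0), (41.0, -100.0)]
W = inverse_distance_weights(centroids)
```

Nearby counties receive larger weights than distant ones, which is how the spatial model lets neighbouring observations influence each other more strongly. The sketch assumes no two centroids coincide, since a zero distance would make the weight undefined.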

A p value < 0.0125 (0.05/4, the Bonferroni correction for multiple testing), two-tailed, was considered statistically significant in all analyses, as four main research hypotheses were tested in this study (i.e., associations of Se, Zn, Fe and Cu with the case fatality rate of COVID-19). All statistical analyses were carried out using Stata (version 15, StataCorp LLC, College Station, TX, USA).
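The significance threshold follows directly from dividing the conventional level by the number of hypotheses. A minimal sketch of the arithmetic, using hypothetical p values and illustrative names:

```python
ALPHA = 0.05
N_HYPOTHESES = 4  # one hypothesis each for Se, Zn, Fe and Cu

# Bonferroni-adjusted significance level: 0.05 / 4 = 0.0125
threshold = ALPHA / N_HYPOTHESES

def is_significant(p_value, level=threshold):
    """Compare a two-sided p value against the Bonferroni-adjusted level."""
    return p_value < level

# Hypothetical p values, for illustration only.
examples = {"element_a": 0.0040, "element_b": 0.0300}
flags = {name: is_significant(p) for name, p in examples.items()}
print(flags)  # prints {'element_a': True, 'element_b': False}
```

Note that a p value of 0.03 would pass the conventional 0.05 level but not the adjusted one.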

Results

Geochemical concentrations of Se, Zn, Fe and Cu for the 3,057 USA counties included in the current study were extracted from the National Geochemical Survey. Summary statistics are shown in Table 1. Correlations of geochemical concentrations between these elements are displayed in Supplementary Table 1.

Table 1 Geochemical concentrations of Se, Zn, Fe and Cu at the USA county level according to the national geochemical survey

County level summary of the latest information on population characteristics, socio-demographics and residential environment index in USA is shown in Table 2.

Table 2 County level summary of population characteristics, socio-demographics and residential environment index

In the studied USA counties, up to December 31, 2020, the total cumulative number of COVID-19 cases was 19,286,619, and the total number of deaths from COVID-19 was 337,696 (i.e., an overall case fatality rate of 1.75%). By county, the median case fatality rate was 1.55% (interquartile range (IQR) 1.02%, 2.30%). Statistics based on other time points are shown in Supplementary Table 2.
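The overall case fatality rate can be checked from the two totals above. The sketch below also shows, with three hypothetical counties, why the pooled rate and the median of county-level rates need not coincide; the function name and the county figures are illustrative only.

```python
from statistics import median

def case_fatality_rate(deaths, cases):
    """Case fatality rate as a percentage of reported cases."""
    return 100.0 * deaths / cases

# Pooled totals reported in the text for December 31, 2020.
overall = case_fatality_rate(337_696, 19_286_619)
print(f"{overall:.2f}%")  # prints 1.75%

# Hypothetical (deaths, cases) pairs for three counties.
counties = [(12, 1_000), (30, 1_500), (5, 500)]
county_rates = [case_fatality_rate(d, c) for d, c in counties]

# Pooled rate weights counties by their case counts; the median does not.
pooled = case_fatality_rate(sum(d for d, _ in counties),
                            sum(c for _, c in counties))
county_median = median(county_rates)
```

The pooled rate is dominated by counties with many cases, whereas the median treats every county equally, which is why the two summary figures in the text differ.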

Geographical patterns of Se, Zn, Fe and Cu concentrations, and COVID-19 case fatality rate (based on the data of December 31, 2020) across conterminous USA at the county level are visualized in Fig. 1. Notably, in the south-east region where concentrations of Se, Zn, Fe and Cu are relatively low (in blue), the case fatality rate of COVID-19 is relatively high (in yellow or orange); whereas in the west coast area where Zn, Fe and Cu are high (in yellow or orange), the case fatality rate is low (in blue), suggesting an overall pattern of inverse association, especially for Zn, Fe and Cu.

Fig. 1 Geographical patterns of Se, Zn, Fe and Cu concentrations and COVID-19 case fatality rate across conterminous USA at the county level. COVID-19 case fatality rate, based on the data of December 31, 2020. (Color print)

Preliminary correlation analysis showed that regional geochemical concentrations were inversely associated with the regional case fatality rate of COVID-19 (Se r = − 0.053, p = 0.0037; Zn r = − 0.162, p < 0.0001; Fe r = − 0.159, p < 0.0001; Cu r = − 0.164, p < 0.0001, data of December 31, 2020) at the county level. By quartiles of each geochemical concentration, case fatality rates of COVID-19 (based on December 31, 2020 and the six other time points) at the county level are shown in Table 3. The case fatality rate mainly differed between the lowest quartile (1st quartile) and the other three (2–4th) quartiles for Zn, Fe and Cu concentrations (Table 3). The inverse association between Se and case fatality rate, however, was not found consistently over time (Table 3). Thus, only regional Zn, Fe and Cu concentrations were then divided into two groups (1st quartile vs. 2–4th quartiles) for later statistical analysis (Fig. 2).
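The two-group split (1st quartile vs. 2–4th quartiles) and the rank-based comparison reported with Fig. 2 can be illustrated as follows. This is a sketch under stated assumptions: the quartile cut point uses the default method of Python's `statistics.quantiles`, which may differ slightly from the rule applied in Stata; only the Mann–Whitney U statistic is computed, not its p value; and all values are hypothetical.

```python
from statistics import median, quantiles

def split_by_lowest_quartile(concentration, outcome):
    """Split outcomes by whether the concentration is at or below Q1."""
    q1 = quantiles(concentration, n=4)[0]  # first quartile cut point
    low = [o for c, o in zip(concentration, outcome) if c <= q1]
    rest = [o for c, o in zip(concentration, outcome) if c > q1]
    return q1, low, rest

def mann_whitney_u(a, b):
    """U statistic for sample a: each pair with a > b counts 1, ties 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in a for y in b)

# Hypothetical county values: concentration (mg/kg) and CFR (%).
zn = [20, 25, 30, 35, 40, 45, 50, 55]
cfr = [2.5, 2.2, 1.6, 1.4, 1.5, 1.3, 1.7, 1.2]

q1, low, rest = split_by_lowest_quartile(zn, cfr)
u = mann_whitney_u(low, rest)
print(q1, median(low), median(rest), u)
```

In this toy example every county in the lowest quartile has a higher rate than every county in the remaining quartiles, so U reaches its maximum, the product of the two group sizes.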

Table 3 Case fatality rate of COVID-19 stratified by quartiles of geochemical concentrations (Se, Zn, Fe and Cu) at the USA county level, assessed at multiple time points over 24 weeks

Fig. 2 Box plot of county level case fatality rate of COVID-19 in the 1st quartile (lowest) of Zn, Fe or Cu concentration, compared to 2–4th quartiles as a whole. Mann–Whitney U test for the difference in case fatality rate between the two comparison groups (by Zn, median rate 1.90 vs. 1.45, p < 0.0001; by Fe, median rate 1.89 vs. 1.45, p < 0.0001; by Cu, median rate 1.84 vs. 1.48, p < 0.0001). COVID-19 case fatality rate, based on the data of December 31, 2020

In fractional regression models, all the studied variables, except for population size, were found to be associated with the case fatality rate of COVID-19 in the univariable analyses (Model 1, Table 4). After adjustment within each domain, Zn concentration (from trace element domain), white alone, age ≥ 70 years and death rate (from population information domain), poverty rate (from socio-demographic domain), and the Rural–Urban Continuum Code (from residential environment domain) were still associated (Model 2, Table 4).

Table 4 Multivariable analyses for the relationship between trace elements (Zn, Fe and Cu) and case fatality rate of COVID-19 with adjustment for other potential confounding factors

In the final model, the lowest quartile of Zn concentration (OR (95% CI), 1.13 (1.07, 1.19), compared to the remaining three quartiles as a whole) remained statistically significantly associated with an increased case fatality rate of COVID-19 after confounding adjustment (Model 3, Table 4). The same statistical modeling using COVID-19 data at different time points revealed consistent results with only a slight difference in ORs (1.09–1.21, Fig. 3). Spatial regression analyses demonstrated the same findings (data not shown).

Fig. 3 Increased case fatality rate of COVID-19 in counties with the lowest quartile of Zn concentration over time, measured by adjusted odds ratios

Discussion

This study, drawing on all cumulative case and mortality information of COVID-19 and geochemical concentrations of immune function-related trace elements (including Se, Zn, Fe and Cu) across conterminous USA, has shown an inverse association between regional Zn concentration and COVID-19 case fatality rate based on a county level analysis. An increased risk of death from COVID-19 was observed in counties with the lowest concentration of Zn in surface soils, and this observation was consistent over time during our assessment period (first sampling time point October 8, 2020, and last point March 25, 2021). The concentrations of trace elements in the human body are fundamentally influenced by geochemical distribution via the food chain, and this has been shown to affect the health of the local population (Johnson et al., 2010; Pepper, 2013; Steffan et al., 2018). In particular, it has been shown that a low concentration of Zn in soil is one of the major factors associated with Zn deficiency in crops and humans (Alloway, 2009; Gashu et al., 2021).

In the literature, in an Indian study among COVID-19 patients at the time of hospitalization, patients with Zn deficiency were found to have higher rates of complications, acute respiratory distress syndrome, corticosteroid therapy, prolonged hospital stay and increased mortality, compared to those without Zn deficiency (Jothimani et al., 2020). A study carried out in Japan also demonstrated that prolonged Zn deficiency was associated with critical illness of COVID-19 in hospitalized patients (Yasui et al., 2020). In a Brazilian study on COVID-19 patients in the intensive care unit, the prevalence of low serum Zn level (Zn deficiency) was as high as 80%, and there was a very strong association between Zn deficiency and severe COVID-19 with an adjusted OR of 15 (Gonçalves et al., 2021). A Spanish team showed (1) in a cohort of hospitalized COVID-19 patients that lower serum levels of Zn at admission were associated with worse clinical presentation, longer time to reach stability and higher mortality; and (2) in vitro that low Zn levels favored viral expansion in SARS-CoV-2 infected cells (Vogel-González et al., 2021). Zn, selenoprotein P and age, as a composite biomarker, proved to be a good predictor of survival odds in COVID-19 (Heller et al., 2021). However, in a retrospective observational study, Zn therapy provided during treatment after hospital admission for COVID-19 was not associated with disease mortality (Yao et al., 2021). A randomized trial of ambulatory patients diagnosed with COVID-19 showed that treatment with high-dose Zn for 10 days did not reduce symptoms including fever, cough, shortness of breath and fatigue, compared to the usual care (Thomas et al., 2021).

A robust association of geochemical Se concentration with case fatality of COVID-19 was not consistently seen over the assessment period, although there appeared to be an association in the early period (up to December 3, 2020). For Fe and Cu, the multivariable regression analyses suggested that the apparent associations may be due to their co-existence with Zn. In the literature, using the epidemiological pattern during the initial outbreak in China, preliminary results showed there was an inverse association between regional Se status (data on hair Se concentration) and mortality rate among COVID-19 patients (Zhang et al., 2020). A study from Germany, using serum samples to measure the total Se (by X-ray fluorescence) and selenoprotein P (by ELISA), demonstrated that among COVID-19 patients, the Se status was significantly higher in those who survived the disease compared to those who did not, providing more direct evidence that Se deficiency may be a risk for COVID-19 mortality (Moghaddam et al., 2020). Clinical data on patients with COVID-19 and their matched control subjects in South Korea also arrived at a similar conclusion that a deficiency of Se may decrease the immune defenses against COVID-19 and lead to more severe disease (Im et al., 2020). With regard to Fe, it was shown that, compared to moderate patients, patients with severe COVID-19 had lower hemoglobin and red blood cell counts, and higher levels of ferritin (Taneri et al., 2020). A combination test of hepcidin and serum ferritin was found to have good predictive value for COVID-19 severity (Zhou et al., 2020a, b). Hepcidin levels alone were also shown to be a predictor of the severity and mortality of COVID-19 in a group of hospitalized patients (Nai et al., 2021). Hyperferritinemia in COVID-19 was demonstrated to be associated with a sustained inflammatory process, lung pathologies in computed tomography scans, poor patients’ performance, and mortality (Perricone et al., 2020; Sonnweber et al., 2020). We are fully aware that, under common circumstances, the effects of environmental factors contributing to health outcomes are usually weak or undetectable, and the non-significant results in our ecological investigation certainly would not challenge the existing clinical evidence.

The availability and quality of geochemical and COVID-19 data across USA offer a novel opportunity (linked database research) to investigate the relationship between the environmental concentrations of nutrient trace elements and the COVID-19 case fatality rate. It allows the study to use a fairly large sample size in terms of the number of COVID-19 patients, and the geographical range and diversity. On February 22, 2021, fatalities from COVID-19 surpassed half a million across USA. This prolonged epidemic has unfortunately affected people across the whole of the USA for many months. However, from the research perspective, it provides a generalized (i.e., less biased) setting where the epidemic situations between counties have become more similar and the treatments have become more standard. In this study, the main results were based on the COVID-19 data of December 31, 2020, 10 months after the first death from COVID-19 reported in USA. This study also benefits from USA being a well-developed country with relatively good health care resources across the whole nation, compared to the developing world where regional (e.g., urban vs. rural) differences may contribute more significantly to the quality of health care and therefore the outcome of diseases. Nevertheless, in the current study, residential environment (measured by Rural–Urban Continuum Code) was considered for adjustment as a confounding factor in the final statistical modeling, as well as other county level variables including population characteristics and socio-demographics. Concentrations of trace elements at the county level were the only data from the National Geochemical Survey. Additional information, such as other parameters of samples that determine the bioavailability of the analyzed elements, was not available. Many other variables (such as food source and details of health care service), which differ among counties, were also not available, which is an obvious limitation of this study.

We did not study the incidence rate (i.e., the percentage of cases among the total population) of COVID-19. As COVID-19 is a highly contagious disease, its incidence should be more related to population characteristics (e.g., age, comorbidity) and density, living environment (e.g., urban vs. rural area) and conditions, modes of transportation, protective behaviors (e.g., wearing masks, washing hands regularly), local regulations on social activities, and weather and air conditions. It is also more likely to be associated with the individual policies and reporting systems of regional areas. We considered that COVID-19 vaccination, which started in late December 2020 in USA, may serve as a factor influencing the association of interest. However, based on the validation analysis that used the data up to March 25, 2021, the observed association remained consistent.

In conclusion, this study is the first to establish the geographical relationship between distributions of trace elements and features of COVID-19 at a very large scale. The results show that, in USA counties with relatively low geochemical concentrations of Zn in surface soils, the case fatality rate of COVID-19 is relatively high, after consideration of other influencing factors. Our findings provide strong epidemiological support for the current literature, which has focused mainly on preliminary clinical evidence suggesting a possible association of Zn, important for maintaining normal immune function, with COVID-19 severity, and they suggest that Zn deficiency should be avoided. To provide more direct evidence, future research is needed to investigate the relationship between these trace elements in food and drinking water and COVID-19 characteristics.