1 Introduction

Cumulative fatalities in the U.S. from the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) reached 1,085,000 from January 1, 2020, through year-end 2022. About half of those deaths (540,000) occurred after 3/10/21, when COVID vaccinations first became generally available.

Figure 1 shows that the weekly mean cumulative rate of fully vaccinated people in the U.S. increased from 8 percent on 3/10/21 to 69 percent on 12/28/22. During that 21-month period, the total number of weekly COVID fatalities experienced two peaks during the spread of the highly contagious Delta and Omicron variants. Following the Omicron peak, the weekly average COVID fatalities dropped sharply from a high of 17,701 during the week of 2/22/22 to 2,505 only two months later on 4/27/22. For the rest of 2022, weekly average COVID fatalities hovered at relatively low levels in a narrow range between 2,000 and 4,000.

Fig. 1 Percent fully vaccinated versus total weekly COVID deaths in the U.S.

While one might argue that the relatively low number of weekly COVID fatalities after 4/10/22 was the result of increasing cumulative vaccination rates, the sharp increases in COVID deaths during the Delta and Omicron outbreaks occurred after mean vaccination rates exceeded 50 percent.

A number of academic papers have studied the impact of vaccinations on COVID death rates. Few, however, have investigated differences in COVID deaths across states and how those differences can be used to estimate the efficacy of COVID vaccinations.

A study that is similar to analyzing differences across states examines diverging patterns of COVID-19 cases in seven countries with high vaccination rates (Bukhari et al., 2021). The study concludes somewhat ambiguously that “the number of cases and deaths have declined significantly (with vaccinations ≥ 50%), whereas in others they have increased compared to pre-vaccination levels” (Bukhari et al., 2021, p. 1). More problematic is the fact that the period of testing in this paper ends on May 30, 2021, and it does not account for covariates that could bias the estimates.

A study that compares U.S. COVID mortality with the ten most- and least-vaccinated states, as well as 20 OECD countries, concludes that the U.S. would have averted 122,304 deaths if COVID-19 mortality matched that of the ten most-vaccinated states (Bilinski et al., 2023). This finding was based on mortality rates over the Delta and Omicron periods from 6/27/21 to 3/26/22. Not only does the aggregation of the ten most-vaccinated states as a comparison group constrain the sample, but the initial period tested starts on 6/27/21 when the average vaccination rate had already reached 46 percent. The sample period ends on 3/26/22 and therefore excludes any empirical findings relating to the 108,000 deaths that occurred after that date.

Robert J. Barro uses differences across states’ vaccination rates and instrumental variables to analyze vaccine efficacy. He argues, “If vaccinations are effective at reducing infections and deaths, these differences should map into differences in COVID-related deaths, hospitalizations, and cases” (Barro, 2022). Barro divides his sample, which runs from 3/19/21 to 5/22/22, into five periods. He uses cross-sectional regressions to measure the effects of vaccinations on COVID-related outcomes and concludes that “one expected life saved requires 248 additional doses.” No estimate is given for the total number of lives saved during the period tested. The period tested ended on 5/22/22, after which 85,000 more COVID deaths occurred during the remainder of the year. Barro’s methodology does not directly include an instrumental variable representing chronic disease to hold differences across states in this variable constant. It might be argued that his use of a life expectancy variable could serve as a proxy for chronic disease. But as will be shown later in this study, life expectancy is an inferior measure as compared to directly including a chronic disease variable in explaining COVID death rates.

In the study to follow, the efficacy of mean vaccinations across states from 3/10/21 to 12/28/22 will be examined. A regression model similar to those in Doti (2021a, b) will be used to measure the efficacy of vaccinations in reducing COVID mortality. The study will hold constant other covariates such as population density, income, stringency, age, chronic disease, and differences in racial/ethnic composition across states. In addition, the study will estimate the impact of each state’s mean vaccination rate on its mean death rate, thereby allowing for an estimation of the total number of lives saved as a result of vaccinations.

A particularly important aspect of this study is the inclusion of a chronic disease variable in the estimation. No known study on the efficacy of COVID vaccinations includes chronic disease as an explanatory variable. It will be shown that including such a variable in regression tests results in more accurate estimates of the efficacy of vaccination in reducing COVID death rates. The present study will also show that excluding chronic disease as an explanatory variable results in unreliably high estimates of lives saved due to COVID vaccinations.

2 Theoretical model

Although mean weekly vaccination rates in the U.S. reached asymptotic peaks of 70 percent by year-end 2022, Fig. 2 shows that there was a relatively wide range in statewide cumulative vaccination rates, with Rhode Island registering the highest at 89.6 percent and Wyoming the lowest at 53.0 percent as of year-end 2022. In comparing weekly average statewide vaccination rates over the 3/10/21 to 12/28/22 period, Vermont was highest at 70.9 percent, and Alabama was lowest at 43.3 percent (see Table 1).

Fig. 2 Comparison of vaccination rates

Table 1 Comparison of mean vaccination rates (vr) and cumulative COVID death rates (dr) by state from 3/10/21 to 12/28/22

These wide differences in statewide averages suggest a potential theoretical approach in measuring the benefits of higher cumulative vaccination rates.

If vaccinations reduce COVID death rates, the mean vaccination rate by state, vrs, will be inversely related to the death rate by state, drs, as shown in Fig. 3.

Fig. 3 Relationship between vaccination rates and cumulative COVID death rates by state

An example of the impact of vaccinations in saving lives is depicted in Fig. 3 by comparing the state with the lowest mean vaccination rate during the 3/10/21 to 12/28/22 period (Alabama) with that of the highest (Vermont). The hypothetical efficacy of a higher vaccination rate is depicted in Fig. 3 as the difference in death rates for those two states, drA – drV = Δ dr. The other 48 states whose vrs range between vrA and vrV would theoretically have death rates (drs) between drA and drV.

The vrs and drs for all 50 states are shown in both alpha and rank order in Table 1. These values conform approximately to a linear trendline, as shown in Fig. 4, with wide differences around that trendline. Notice that the placement of the intercepts for Alabama’s and Vermont’s mean vaccination rates (vrA and vrV) and COVID cumulative death rates (drA and drV) are similar to those hypothesized in Fig. 3.

Fig. 4 Relationship between mean vaccination rates and cumulative COVID death rates by state from 3/10/21 to 12/28/22

Although the trendline shown in Fig. 4 suggests an inverse relationship between mean vaccination rates and cumulative COVID death rates by state, it is possible that other explanatory variables that are collinear with vaccination rates may lead to a misleading characterization of the efficacy of COVID vaccinations. An empirical model presented in the following section will address that issue by examining a wider set of explanatory variables in addition to vaccination rates that might significantly influence COVID death rates. It will be shown that chronic disease, in particular, has a significant impact on COVID deaths, an impact that needs to be held constant in order to derive unbiased estimates of the efficacy of vaccinations on COVID mortality.

3 Empirical model

In order to measure the impact of differences across states in vaccination rates on COVID-19 death rates, it will be necessary to hold constant other variables that may influence COVID-19 deaths as well as define more precisely the variables to be used in formulating the empirical tests.

Cumulative confirmed COVID-19 deaths per 100,000 in state population during the 3/10/21 to 12/28/22 period serve as the dependent variable in the model. A death is defined as a person who meets the clinical and epidemiological criteria for a SARS-CoV-2 death.

The structural form of the model is shown below in Eq. (1).

$$ C_{i,t} = b_{0} + b_{1} x_{1,i} + b_{2} x_{2,i} + \ldots + b_{n} x_{n,i} $$
(1)

where \(C_{i,t}\) is the cumulative COVID-19 death rate per 100,000 in state i at the end of some period t; \(x_{1,i}, \ldots, x_{n,i}\) are the n independent variables in state i; and \(b_{0}, b_{1}, \ldots, b_{n}\) are parameters to be estimated.

Note: Display of the error term is suppressed.

In order to control and test for the factors that explain the cumulative COVID-19 death rate by state during some time interval t, the following variables shown below in Eq. (2) were selected.

$$ \begin{aligned} {\text{death}}_{i,t} = {} & b_{0} + b_{v}\,{\text{vaccine}}_{i} + \sum_{d=1}^{2} b_{d,t}\,{\text{density}}_{d,i} + \sum_{y=1}^{2} b_{y,t}\,{\text{income}}_{y,i} \\ & + b_{s,t}\,{\text{stringency}}_{i} + b_{a,t}\,{\text{age65}}_{i} + \sum_{h=1}^{3} b_{h,t}\,{\text{health}}_{h,i} \\ & + \sum_{r=1}^{4} b_{r,t}\,{\text{racial/ethnic}}_{r,i} \end{aligned} $$
(2)

where \({\text{death}}_{i,t}\) is the cumulative COVID-19 death rate per 100,000 in state i during some period t; the sums run over the two density variables, the two income variables, the three health-related variables, and the four racial/ethnic variables; and \(b_{0}, b_{v}, \ldots, b_{r,t}\) are parameters to be estimated.

The definition and descriptive statistics for the dependent and independent variables are as shown in Table 2.

Table 2 Dependent and independent variables used in the study

4 Impact of vaccination and other variables on COVID-19 death rates

Using a regression model, the explanatory variables are added in groupings from I to VII, as shown in Table 2. The regression results are presented in Regressions 1 to 8, Table 3. Note that explanatory variables were removed if not significant at the p < 0.10 level (one-tailed), and the “best” fit regression, Regression 8, Table 3, is shown in the last column.

Table 3 Regression results, dependent variable definition: mean cumulative death rate (COVID deaths per 100,000 people by state) from 3/10/21 to 12/28/22

4.1 Vaccination variable, vaccine

The estimated coefficient of − 2.28 for the vaccination variable shown in Table 3, Regression 8 suggests that after holding other explanatory variables constant, a state’s COVID-19 death rate decreases by 2.28 deaths per 100,000 in a state’s population for every increase of one percentage point in a state’s average vaccination rate. The measured t statistic of − 3.75 for vaccine in Regression 8, Table 3, is highly significant at the p < 0.01 level (one-tailed).

The empirical findings of Regression 8 can be used to estimate the reduction in COVID deaths as a result of each state’s mean vaccination rate during the 3/10/21 to 12/28/22 period. This simulation that is based on a counterfactual scenario of no vaccination is shown in Table 4. For example, Alabama had a death rate of 209.5, with a mean vaccination rate of 43.3. If, as shown in Column 4, Alabama had a vaccination rate of zero, its corresponding death rate would have been 308.2. That calculation is based on the estimated coefficient of − 2.28 for the vaccination rate, vr, as shown in Table 3, Regression 8.

Table 4 The impact of vaccinations in reducing COVID deaths

In order to convert the reduction of 98.7 in Alabama’s death rate in Column 5 to a reduction in the number of deaths, it is necessary, as shown in Column 7, to multiply 98.7 by the state’s population divided by 100,000. The resulting estimate of 5,010 fewer deaths is presented in Column 8. To express that number as a percentage of Alabama’s population, the reduction in deaths (Column 8) is divided by the population (Column 6) and then multiplied by 100, as shown in Column 9. The resulting estimate of 0.099 suggests that vaccinations, at Alabama’s mean vaccination rate of 43.3 percent, averted COVID deaths equal to about a tenth of a percent of the state’s population.
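The Table 4 arithmetic for a single state can be sketched as follows. This is a minimal illustration, assuming the Regression 8 coefficient of − 2.28 applies linearly down to a vaccination rate of zero; the function names are hypothetical and are not taken from the study.

```python
# Sketch of the Table 4 counterfactual for one state (illustrative only).
VACCINE_COEF = -2.28  # Regression 8, Table 3: deaths per 100,000 per percentage point


def counterfactual_death_rate(actual_rate, mean_vr, coef=VACCINE_COEF):
    """Death rate per 100,000 implied by a vaccination rate of zero."""
    return actual_rate - coef * mean_vr


def deaths_averted(rate_reduction, population):
    """Convert a reduction in deaths per 100,000 into a number of deaths."""
    return rate_reduction * population / 100_000


def percent_of_population(averted, population):
    """Express averted deaths as a percentage of the state's population."""
    return averted / population * 100


# Alabama figures quoted in the text: death rate 209.5, mean vaccination rate 43.3.
alabama_no_vaccine = counterfactual_death_rate(209.5, 43.3)
alabama_reduction = alabama_no_vaccine - 209.5
print(round(alabama_no_vaccine, 1))  # 308.2
print(round(alabama_reduction, 1))   # 98.7
```

Because the population cancels when averted deaths are divided by population, the Column 9 percentage is simply the rate reduction divided by 1,000, which gives about 0.099 for Alabama whatever population figure is supplied.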

That estimate can be compared to Vermont, the state with the highest vaccination rate of 70.9 percent. As shown in Column 9 of Table 4, Vermont’s high rate of vaccination resulted in a COVID death rate that was reduced by 0.162 percent, considerably higher than Alabama’s estimated reduction of 0.099 percent.

For all 50 states, the reduction in COVID deaths resulting from vaccinations is estimated at 427,000, or 0.128 percent of the nation’s population. Since the actual number of COVID deaths during the 3/10/21 to 12/28/22 period was 577,000, this estimate suggests that had vaccinations not been available, COVID deaths in the U.S. during that period would have reached about 1 million (actual COVID deaths of 577,000 plus the estimated reduction of 427,000), or 0.301 percent of the population. That compares to actual deaths of 0.173 percent of the nation’s population. The decline of 0.128 (0.301 less 0.173) is the value shown in the total row of Table 4.

Table 5 shows a listing of states in rank order of the reduction in COVID deaths as compared to each state’s mean vaccination rate. Figure 5 represents the data shown in Table 5 arranged in quintiles from the 10 states with the highest mean vaccination rate down to the 10 states with the lowest vaccination rates. For example, the 10 states with the highest vaccination rate had a mean vaccination rate of 67.3 percent and a mean reduction in the death rate of 0.153. The 10 states with the lowest mean vaccination rate (45.8 percent) experienced a 0.104 reduction in their mean death rate.

Table 5 A listing of states in rank order of the reduction in COVID deaths as a percentage of each state’s population

Fig. 5 Mean Reduction in Deaths as a Percent of Mean Population

4.2 Density variables, density and sdensity

In addition to a state’s population density, a super density (sdensity) variable was added to the regression tests. Since density is a measure of a state’s population divided by its total geographical area, it does not capture the impact of those states where a highly populated metropolitan area, like New York City, exhibits extremely high density. That impact can be captured by a super density variable. All cities in the nation with a population of 300,000 or more and a population density of at least 10,000 people per square mile were identified, and their population was measured as a ratio of each state’s total population. The resulting ratio, in turn, was multiplied by the density of the metropolitan areas that met the selection criteria described above. In the structural form of the model, this super density variable (sdensity) is given by

$$ {\text{sdensity}}_{i,t} = \sum_{k=1}^{n_{i}} \frac{p_{k,i}}{p_{i,t}} \cdot {\text{density}}_{i,t} $$

where \(p_{k,i}\) is the population of the kth city in state i with a population > 300,000 and a density > 10,000 per square mile; \(n_{i}\) is the number of cities in state i with a population > 300,000 and a density > 10,000 per square mile; \(p_{i,t}\) is the population of state i as of some period t; and \({\text{density}}_{i,t}\) is the density of state i as of some period t.
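The super density calculation can be sketched as below. This is only an illustration that follows the formula and its variable definitions literally, so the density term is the state-level density defined above; the function name, the tuple layout, and the toy numbers are assumptions and not data from the study.

```python
# Sketch of the super density calculation for one state (illustrative only).
MIN_CITY_POPULATION = 300_000  # population threshold stated in the text
MIN_CITY_DENSITY = 10_000      # people per square mile, threshold stated in the text


def sdensity(cities, state_population, state_density):
    """Sum of qualifying-city population shares, each multiplied by density_{i,t}.

    `cities` is a list of (population, density_per_square_mile) tuples.
    Cities below either threshold contribute nothing to the sum.
    """
    total = 0.0
    for city_population, city_density in cities:
        if city_population >= MIN_CITY_POPULATION and city_density >= MIN_CITY_DENSITY:
            total += city_population / state_population * state_density
    return total


# Toy inputs (hypothetical): only the first city passes both thresholds.
toy_cities = [(1_000_000, 12_000), (350_000, 8_000)]
print(sdensity(toy_cities, state_population=4_000_000, state_density=200.0))  # 50.0
```

A state with no qualifying city gets a value of zero, which is consistent with the variable being designed to single out states with very dense large cities.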

Although the density and super density variables were highly significant in earlier tests conducted during the pre-vaccination period from 4/1/20 to 12/1/20 (Doti 2021a), they were not significant during the 3/10/21 to 12/28/22 period covered in the current study (See Regression 2, Table 3).

These findings suggest that while COVID hit dense states and highly dense urban areas particularly hard during the early stages of the pandemic in 2020, that impact fell away over time and was no longer significant during the time that vaccinations became available and were being administered.

4.3 Income variables, py and poverty

Because the personal income variable, py, did not pass the p < 0.10 significance test in Regression 3, Table 3, it was dropped from Regression 4. In sharp contrast, the poverty rate variable, poverty, was significant in Regressions 3–5, Table 3. But it too was dropped when the chronic disease variable, chronic, was added in Regression 6, Table 3. This will be explained in more detail in Sect. 4.6.

4.4 Stringency variable, stringency

The efficacy of a state’s governmental regulations that impose mandates in order to control the spread of COVID is measured by the Oxford daily governmental stringency index (stringency). This index measures the stringency of statewide governmental mandates on a daily basis using a scale from 0 to 100 for eleven different types of governmental responses. The index measure used in this study was derived by calculating an average stringency index from the weekly averages for each state during the 3/10/21 to 12/28/22 period. The derivation is given by

$$ {\text{stringency}}_{i,t} = \frac{1}{n_{t}} \sum_{w=1}^{n_{t}} {\text{stringency}}_{i,w} $$

where stringencyi,t is the mean stringency index in state i as of some period t, stringencyi,w is the stringency index in state i as of a particular week, w, and nt is the number of weeks during period t.
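The averaging step can be sketched in two stages: daily index values are collapsed into weekly means, and the weekly means are then averaged over the period. This is a minimal illustration with hypothetical function names and a toy series, not Oxford data.

```python
from statistics import fmean


def weekly_averages(daily_scores, days_per_week=7):
    """Collapse a daily index series into consecutive weekly means.

    A final partial week, if any, is averaged over the days it contains.
    """
    return [
        fmean(daily_scores[start:start + days_per_week])
        for start in range(0, len(daily_scores), days_per_week)
    ]


def mean_stringency(weekly_scores):
    """Mean of the weekly values: the sum over w of stringency_{i,w}, divided by n_t."""
    return fmean(weekly_scores)


# Toy daily series (hypothetical): one week at 40 followed by one week at 20.
toy_daily = [40.0] * 7 + [20.0] * 7
toy_weekly = weekly_averages(toy_daily)
print(toy_weekly)                   # [40.0, 20.0]
print(mean_stringency(toy_weekly))  # 30.0
```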

In a previous study, the stringency variable was found to be highly significant in reducing COVID death and case rates throughout 2020 and particularly during the 7/1/20 to 12/31/20 period (Doti, 2021a). In fact, of all the variables tested in that study, stringency was the only explanatory variable during the second half of 2020 that had p < 0.05 or higher level of significance.

In the current study, however, the stringency variable exhibited no significance in reducing statewide death rates after the age65 variable was added in Regression 5, Table 3. A possible explanation for the lack of significance of the stringency variable in reducing COVID deaths during the 3/10/21 to 12/28/22 period is that states, on average, reduced their use of policy intervention, thereby leading to a sharp decline in the Oxford stringency score. This can be seen in Fig. 6, which shows the mean weekly stringency score declined from 42.8 during the 1/8/20 to 3/10/21 period to 20.1 during the 3/10/21 to 12/28/22 period.

Fig. 6 Mean weekly stringency for the U.S.

Figure 7 shows that the steady decline in the mean weekly stringency score during the 3/10/21 to 12/28/22 period occurred during a time when the mean death rate decreased from 218.0 at the beginning of the period to 50.5 by its end. While that may explain the accompanying decline in the efficacy of the stringency variable in reducing death rates during that period, stringency continued its steady decline even during the periods when the COVID death rate spiked as a result of the Delta and Omicron outbreaks.

Fig. 7 A comparison of stringency and weekly COVID death rates per 100,000 in the U.S. from 3/10/21 to 12/28/22

It might be argued that statewide differences in stringency were effective during the period when the Delta and Omicron surges occurred. To test for that possibility, the time interval for calculating each state’s mean death rate and mean stringency rate was changed to that period when death rates increased above 100 and then fell back below 100. That period is from 8/8/21 to 3/30/22. As shown in Fig. 8, no discernable relationship appears to exist between mean stringency and death rates during that period.
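The window selection described above, in which the period starts when the death rate rises above 100 and ends when it falls back below 100, can be sketched as follows. The function name and the toy weekly series are hypothetical, and the sketch assumes a single contiguous run above the threshold.

```python
def surge_window(weekly_rates, threshold=100.0):
    """Return (start, end) indices of the first run of weeks above `threshold`.

    `start` is the first week whose rate exceeds the threshold and `end` is the
    last week before the rate falls back to or below it. Returns None if the
    series never exceeds the threshold.
    """
    start = None
    for week, rate in enumerate(weekly_rates):
        if start is None and rate > threshold:
            start = week
        elif start is not None and rate <= threshold:
            return start, week - 1
    if start is None:
        return None
    # The series ended while still above the threshold.
    return start, len(weekly_rates) - 1


# Toy weekly series (hypothetical values, not the study's data).
toy_rates = [60, 90, 120, 150, 110, 80, 70]
print(surge_window(toy_rates))  # (2, 4)
```

Mean death rates and mean stringency would then be recomputed over only the weeks inside the returned window.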

Fig. 8 Relationship between mean stringency and cumulative COVID death rates by state during the Delta and Omicron surges (8/8/21 to 3/30/22)

That finding is supported in a reestimation of Regression 6, Table 3, over the 8/8/21 to 3/30/22 period (detailed regression results not reported here). The estimated coefficient for stringency continued to be insignificant (t-statistic = − 0.43).

Finally, regression tests were conducted over the 3/10/21 to 7/14/21 period when the mean stringency rates, while falling, were still above a score of 20 (See Fig. 6). As in the regression tests over the longer 3/10/21 to 12/28/22 period, the coefficient for the stringency variable was insignificant (p value = 0.86).

4.5 Age variable, age65

As shown in Fig. 9, 75.2 percent of COVID deaths occurred among people over the age of 65, even though that age cohort represents only 17 percent of the total U.S. population. It is not surprising, therefore, that the coefficient for the age65 variable in Regression 8, Table 3 was highly significant. The coefficient of 10.76 for the age65 variable in that regression suggests that a one-percentage-point increase in the share of a state’s population over the age of 65 is associated with an increase of 10.76 deaths per 100,000 in that state’s mean cumulative death rate.

Fig. 9 COVID deaths by age cohort as a percentage of all COVID deaths from January 2020 to February 2023

The importance of including age65 as an explanatory variable can also be seen in Fig. 10, which reproduces Fig. 4 and shows the relationship between only two variables: the vaccination rate and the death rate. Florida and Maine are highlighted in Fig. 10. Those two states are the only two with the age65 variable above 20 percent. It is likely, therefore, that the relatively high level of unexplained variation for Florida and Maine from the trendline in Fig. 10 is at least partially explained by a greater proportion of those states’ populations being vulnerable to the lethal nature of COVID.

Fig. 10 Relationship between mean vaccination rates and cumulative COVID death rates by state from 3/10/21 to 12/28/22

4.6 Chronic disease and other health-related variables

The obesity variable was not significant in Regression 6, Table 3. While the drugscore variable in Regression 6, Table 3 was moderately significant in that regression (p < 0.10), it fell out of significance in Regression 7, Table 3 when the racial/ethnic variables were added to the model.

The chronic variable, however, was highly significant (p < 0.01). Its inclusion increased the adjusted R-squared value from 0.64 to 0.75. The linear relationship between COVID death rates and chronic disease is shown in Fig. 11. This measure was obtained from a self-rated survey conducted by the C.D.C.’s Division of Population Health from its Behavioral Risk Factor Surveillance System (BRFSS), which “uses complex sampling designs and weights” in surveying all states for chronic disease risk factors (Holt et al., 2015).

Fig. 11 Relationship between chronic disease and cumulative COVID death rates by state from 3/10/21 to 12/28/22

The demographic group included in the 2021 survey was resident adults aged ≥ 18 years. This measure is based on self-assessment only and does not include an objective health component. The numerator for the chronic variable was all respondents who self-reported their health status as fair or poor. The denominator included all respondents, excluding those who refused to answer, had a missing answer, or answered “don’t know/not sure” (Holt et al., 2015).

Table 6 lists the percentage of respondents reporting “fair or poor” health in alpha order and rank order by state. It might be argued that self-reporting during the 2021 survey year is potentially biased, since respondents might have been inclined to report their health status as fair or poor if they had recently experienced a coronavirus infection. The data, however, do not support that view. The mean score for the chronic variable was 14.2 in the 2021 survey and 12.4 in the 2020 survey, both lower than the mean scores of 17.3, 16.5 and 17.3 in the 2017, 2018 and 2019 surveys, respectively (Source: C.D.C. U.S. Chronic Disease Indicators).

Table 6 Fair or poor self-rated health status among adults aged ≥ 18 years, 2021 data

Figure 12 reproduces Fig. 4, which shows the relationship between the vaccination rate and death rate. Notice that Oklahoma, West Virginia, and Kentucky deviated sharply from the trendline. A major reason these states are statistical outliers in Fig. 12 is likely their high percentage of chronic disease, as reported in the C.D.C. survey. That survey, as shown in Table 6, showed that Kentucky ranked #1 in chronic disease, with West Virginia at #3 and Oklahoma at #6.

Fig. 12 Relationship between Mean Vaccination Rates and Cumulative COVID Death Rates by State from 3/10/21 to 12/28/22

More importantly, the regression shown in Table 3 points to a clearer explanation of causality in explaining COVID mortality. When a poverty variable was added in Regression 3, Table 3, the coefficient for the vaccination variable, vr, declined sharply from − 4.32 to − 2.15, with the poverty variable showing significance at the 0.01 level. But when the chronic disease variable, chronic, was added in Regression 6, Table 3, the poverty variable was no longer significant, while the chronic variable was significant at the 0.01 level.

These empirical results suggest that poverty and chronic disease are highly correlated as reflected by a correlation coefficient of 0.62 (p < 0.01) as well as the scatter diagram shown in Fig. 13.
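The significance of the reported correlation can be checked with the standard t statistic for a Pearson correlation coefficient. The sketch below assumes the correlation of 0.62 was computed across all 50 states; that sample size is an assumption, and the function name is hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing a Pearson correlation against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# r = 0.62 is reported in the text; n = 50 states is assumed.
t_stat = correlation_t_statistic(0.62, 50)
print(round(t_stat, 2))  # 5.47
```

With 48 degrees of freedom, a t statistic of this size is well above the conventional 1 percent critical value of roughly 2.7, which agrees with the reported p < 0.01.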

Fig. 13 Relationship between Poverty Rates and Chronic Disease Rates

Since the chronic disease variable displaces the poverty variable in the regression results shown in Table 3, it appears that the line of causality is as follows:

Higher Poverty → More Chronic Disease → More COVID Deaths.

While poverty might serve as a proxy for chronic disease, these empirical findings suggest that the direct impact of chronic disease on COVID mortality is more predictive than the indirect effect of poverty in explaining COVID mortality.

In a study by Robert Barro, state lifespan in 2018 was used as an explanatory variable in the first-stage regression tests. Given the high correlation between chronic disease and lifespans across states, as shown in Fig. 14, it might be thought that lifespans can serve as a useful proxy for chronic disease. This, however, is not the case. When the lifespan variable was substituted for chronic disease in Regression 8, Table 3, the adjusted R2 value fell from 0.77 to 0.63. The measured t statistic for the lifespan variable was − 3.31, as compared to 7.04 for the chronic disease variable.

Fig. 14 Relationship between Chronic Disease and Lifespans

These findings point to the critical importance of using a chronic disease variable in any regression tests across states. Excluding the direct impact of chronic disease in explaining COVID deaths will bias upwards the efficacy of COVID vaccinations in reducing death rates. The reason for this is that states with lower vaccination rates also tend to have higher rates of chronic disease. That means the exclusion of chronic disease from the equation will incorrectly lead to a higher estimated number of lives saved from COVID vaccinations. In reality, when vaccination rates are low, many of the deaths in a state occur not necessarily because of the low rates of vaccination but because that state has a higher rate of chronic disease. Similarly, states with relatively higher rates of vaccination tend to have lower rates of chronic disease. As a result, estimates of lives saved in a state because of a higher rate of vaccination will tend to attribute the lower death rate in that state exclusively to its higher vaccination rate when, in fact, a significant portion of that lower death rate is due to lower rates of chronic disease.

It should be noted that the estimated lives saved, as calculated in Table 4, were based on the estimated coefficient of − 2.28 (see Column 4, Table 3). If chronic disease is not held constant as a regressor, the estimated coefficient for the vaccination rate would be − 4.32. Using that coefficient instead of − 2.28 would result in an inflated estimate of 809,000 lives saved instead of the 427,000 estimated in this study, as shown in Table 4.
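Because the Table 4 method multiplies each state’s mean vaccination rate by the same coefficient, the national total scales in proportion to that coefficient. A minimal sketch of that rescaling, using only the figures quoted above (the function name is hypothetical):

```python
def rescaled_lives_saved(baseline_estimate, baseline_coef, alternative_coef):
    """Scale a lives-saved total that is linear in the vaccination coefficient."""
    return baseline_estimate * alternative_coef / baseline_coef


# 427,000 lives saved at a coefficient of -2.28, rescaled to a coefficient of -4.32.
inflated = rescaled_lives_saved(427_000, -2.28, -4.32)
print(round(inflated, -3))  # 809000.0
```

The small gap between the unrounded result and 809,000 reflects rounding in the published inputs.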

4.7 Racial/Ethnic Variables, afram, asian, hispan, and ai/an

The only racial/ethnic variable that exhibited significance (p < 0.01) was for American Indians/Alaska Natives (ai/an). As shown in Regression 7, Table 3, the coefficient for ai/an was 2.39, which suggests that a one-percentage-point increase in the ai/an share of a state’s population is associated, on average, with an increase of 2.39 deaths per 100,000 in that state’s mean cumulative COVID death rate over the 3/10/21 to 12/28/22 period.

The fact that the ai/an variable is such a significant explanatory variable is noteworthy, given the relatively low percentages of ai/an in most states. For example, 40 states have ai/an percentages within a very low range of 0.15 to 1.28 percent. These low percentages present analytic challenges in making statistical inferences about statewide differences in ai/an in regression modeling.

In spite of these challenges, the high degree of significance of the ai/an variable in Regression 8, Table 3 attests to the serious health challenges faced by the American Indian community. As stated in C.D.C.’s Morbidity and Mortality Weekly Report, compared with other racial/ethnic groups, American Indians/Alaska Natives (AI/AN) have a lower life expectancy, lower quality of life and are disproportionately affected by many chronic conditions (Adakai et al., 2018). The health challenges facing American Indians are perhaps best reflected by the fact that, on average, American Indians die 12–13 years earlier than White Americans (Gorzig et al., 2022).

Factors that help explain the health disparities for American Indians include low income, low educational attainment rates, health access, and higher crime, pollution, and injury rates (https://www.cdc.gov/healthequity/whatis/index.html).

5 Conclusion

Much controversy has arisen over the efficacy of COVID vaccinations and the efforts taken or not taken by state governments to increase their mean vaccination rates. In spite of this, no academic papers have been published that directly examine statewide differences in vaccination rates while holding constant the impact of chronic disease. This study hopes to fill the gap by presenting a regression test that measures the hypothesized impact of vaccinations on each state’s COVID death rate while holding chronic disease and other explanatory variables constant.

The empirical findings presented in Table 3 show that the vaccination rate, the percentage of people over age 65, the percentage of people with chronic disease, and the percentage of American Indians/Alaska Natives are all significant at the p < 0.01 (one-tailed) level or higher and have the hypothesized signs of association.

On average, the regression findings suggest that a state’s COVID death rate changes by − 2.28 deaths per 100,000 in a state’s population for every increase of one percentage point in a state’s vaccination rate. That value, coupled with each state’s mean vaccination rate over the 3/10/21 to 12/28/22 period, makes it possible to estimate the reduction in each state’s COVID death rate. The resulting reduction in each state’s death rate because of vaccinations can be used to solve for each state’s reduction in the number of COVID deaths. As shown in Table 4, the regression findings suggest that COVID vaccinations led to a reduction of about 427,000 COVID deaths. Since the actual number of COVID deaths during the 3/10/21 to 12/28/22 period was 577,000, the estimated reduction of 427,000 COVID deaths during that same period suggests that COVID deaths would have reached about 1 million had vaccinations not been available. Similarly, the nation’s COVID death rate is estimated to have declined from 0.301 deaths as a percent of the nation’s population to 0.173 because of vaccinations.

The regression findings also suggest that at least over the 3/10/21 to 12/28/22 period, variables representing state population density, income, and stringency were not found to be significant in explaining COVID death rates.

The percentage of a state’s population over the age of 65, age65, was highly significant in explaining statewide differences in COVID death rates. Specifically, Regression 8, Table 3, suggests that an increase of one percentage point in the share of people over the age of 65 in a state leads to an increase of 10.76 deaths per 100,000 in that state’s COVID death rate.

The only health-related variable that proved to be significant was one that measured the percentage of people in a state who self-reported as suffering from chronic disease. That chronic disease variable was highly significant and suggested that a one-percentage-point increase leads to an increase of 11.97 deaths per 100,000 in a state’s COVID death rate. It was shown that the direct effect of chronic disease in determining COVID death rates across states is a more accurate regressor than indirect proxies of health such as poverty rates or lifespans across states.

Of the four racial/ethnic variables tested, only the American Indian/Alaska Native variable was significant. This empirical finding points to the serious health challenges facing American Indians.

This empirical study focused on the benefits of COVID vaccinations in reducing COVID death rates. Future research should extend these findings to measure the costs of vaccinations in order to derive a broader analysis of the relevant costs as well as benefits.