Introduction

In the USA, the policy response to COVID-19 was primarily left to state governments. In the absence of a vaccine or treatment for the virus, states instituted a series of non-pharmaceutical interventions (NPIs) to slow the spread of the virus, such as mask mandates, stay-at-home orders, and social distancing guidelines. The success of the NPIs that states implemented was largely dependent on compliance of individuals and businesses. Nagler et al. (2020) show that nearly 75% of adults were exposed to conflicting information about the severity of COVID-19 and the effectiveness of NPIs from media outlets and politicians (see Barker 2020, Collins 2020, and Schumaker 2020 as examples). For those exposed to conflicting information, accurately deciphering data and statistics published by the Centers for Disease Control (CDC), World Health Organization (WHO), and local public health agencies may have been difficult.

A higher level of education is associated with better health (Conti et al. 2010; Cutler and Lleras-Muney 2010; Lleras-Muney 2005) and is an important factor in the adoption of preventative measures against infectious disease (Bawazir et al. 2018; Leung et al. 2003). However, education in mathematics and science would be necessary when evaluating the data and statistics prepared by public health agencies. Davis and Simmt (2003) showed that education in mathematics is crucial to understanding the complexity of science and epidemiology. While Ang et al. (2020) found that education in mathematics and science impacts the number of COVID-19 cases, they did not assess the relative role of mathematics and science on one another.

Currently, data on science scores at the county level are not available, so in this analysis, we investigate how education in mathematics in young people (ages 18–25) affected the spread of COVID-19 in the USA using a county-level cross-sectional analysis. The Stanford Education Data Archive provides normalized standardized mathematics test scores for eighth grade showing the average level of mathematics comprehension at the county level. We find that a one-grade-level increase in mathematics comprehension (i.e., if a county’s eighth graders had the average comprehension level of ninth graders) resulted in a 7–15% reduction in the number of COVID-19 cases. This suggests that education in mathematics for young adults was an important factor in reducing the spread of COVID-19 in the USA. Secondly, we find that the higher a county’s vote share (i.e., the percent of the vote) for Joe Biden, the greater the reduction in the number of COVID-19 cases that came with a one-grade-level increase in math comprehension. While not a causal result, one interpretation of this result would suggest that while education in mathematics is important, political affiliation and the ability to synthesize information that came from political leaders were also factors in slowing the spread of COVID-19. These findings suggest that if states and localities place a greater emphasis on education in mathematics, it may help in preventing the spread of disease in a future public health crisis.

Data

To study the impact education in mathematics in young adults had on the spread of COVID-19 in the USA, we construct a cross-sectional dataset at the county level from several sources. Our primary variable of interest is the level of mathematical comprehension at the county level. We use the average standardized mathematics scores from eighth grade from 2009 to 2015. These data come from the Stanford Education Data Archive (SEDA) (Reardon et al. 2021). The SEDA test scores report the average “level” of mathematics comprehension for each county as a continuous variable from 4 to 12 based on each individual state’s standardized testing criteria and scores. These values correspond to the average grade level of comprehension for each county. For example, if this variable has a value of 7 for a specific county, the average eighth grader’s level of mathematics comprehension in that county is that of a seventh grader. Data are not available prior to 2009; therefore, this study focuses on the impact of education in mathematics for young adults, as those who were in eighth grade in 2015 would have been 18–19 in 2020 and those who were in eighth grade in 2009 would have been 24–25. While these data are useful and valuable, they are not without limitations. It is possible that adults may not live in the same county as they did in the eighth grade. We are unable to measure what percent of current residents took the standardized test in the eighth grade within the same county. However, survey data show that 54–72% of Americans live in or near the city that they grew up in (North American Van Lines 2019; Heartland Monitor Poll 2015), with the average American living within 18 miles of their mother (Bui and Miller 2015), and the number of people relocating from where they grew up has been decreasing over the last few decades (Malloy et al. 2011). This is an imperfect measure, but it likely captures the majority of households in most counties.
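The cohort ages quoted above follow from simple arithmetic, which the short sketch below makes explicit. It assumes that eighth graders are typically 13 to 14 years old; that age range and the function name `age_range_in` are illustrative assumptions, not something taken from the SEDA documentation.

```python
# Assumption (not from the paper): eighth graders are typically 13 to 14 years old.
EIGHTH_GRADE_AGES = (13, 14)


def age_range_in(year_tested, target_year=2020, grade_ages=EIGHTH_GRADE_AGES):
    """Return the (youngest, oldest) age in target_year of a cohort tested in year_tested."""
    elapsed = target_year - year_tested
    if elapsed < 0:
        raise ValueError("target_year must not precede year_tested")
    return (grade_ages[0] + elapsed, grade_ages[1] + elapsed)


# The first and last test years used in the analysis.
for year in (2009, 2015):
    low, high = age_range_in(year)
    print(f"Eighth grade in {year}: ages {low}-{high} in 2020")
```

Under that assumption, the 2009 and 2015 test years bracket the 18 to 25 age range on which the study focuses.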

Data on COVID-19 cases come from USA Facts (2021). The variable for COVID-19 cases includes all cases in the USA for all age groups and is adjusted to be cases per 1000 people to normalize across counties for population. We look at COVID-19 cases 30, 60, and 90 days after each county reported its first case. These dates were chosen to see the impact of education in mathematics in the early stages of the pandemic before there was a rapid growth in cases; before the politicization of mask mandates and stay-at-home orders (as noted in Kahane 2021 and Dyer 2020); before some states started partially lifting restrictions; and before vaccines became available. We chose these dates because they are far enough apart to check that our estimates are stable across time, but not so far apart that other factors (e.g., some states lifting restrictions and vaccines becoming available) would have a large influence.
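As an illustration of how such an outcome variable could be built, the sketch below computes cases per 1000 people a fixed number of days after a county's first reported case. The input layout (a dictionary of cumulative counts keyed by date), the function name, and the toy numbers are all assumptions for illustration; the actual USA Facts file layout and the authors' processing steps may differ.

```python
from datetime import date, timedelta


def cases_per_1000_after(cumulative, population, days):
    """Cases per 1000 people `days` days after the first reported case.

    cumulative: dict mapping datetime.date -> cumulative case count for one county
                (hypothetical layout, chosen for this sketch).
    population: county population.
    """
    # Date of the first report with a positive cumulative count.
    # Raises ValueError if the county never reported a case.
    first_case = min(d for d, count in cumulative.items() if count > 0)
    target = first_case + timedelta(days=days)
    # Use the most recent report on or before the target date.
    latest = max(d for d in cumulative if d <= target)
    return 1000.0 * cumulative[latest] / population


# Toy county: first case on 2020-03-10, population 50,000 (made-up numbers).
toy = {
    date(2020, 3, 9): 0,
    date(2020, 3, 10): 1,
    date(2020, 4, 9): 120,   # 30 days after the first case
    date(2020, 5, 9): 300,   # 60 days after the first case
}
rate_30 = cases_per_1000_after(toy, population=50_000, days=30)
rate_60 = cases_per_1000_after(toy, population=50_000, days=60)
```

Anchoring the window to each county's own first case, rather than to a calendar date, is what lets counties whose outbreaks started at different times be compared at the same stage.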

We use a series of control variables that include the percent of each county that is uninsured, is physically inactive, has some college education, is over age 65, or has poor health, as well as median household income and the primary care rate for 2020. These data come from the County Health Rankings and Roadmap (University of Wisconsin Population Health Institute 2021).

To test for possible heterogeneous effects and to understand how people processed information, we also use county-level data on the 2020 presidential election to construct a variable for the percent of each county that voted for Joe Biden. The data used to construct this variable come from the MIT Election Data and Science Lab (2018).

Descriptive statistics for all variables used in this analysis can be found in Table 1, and a detailed description of each variable can be found in Table 5 in “Appendix”.

Table 1 Descriptive statistics

Methodology

Mathematics scores

To investigate the impact of education in mathematics on the spread of COVID-19, we estimate a series of ordinary least squares (OLS) and fixed effect regressions. Our primary specifications take the following form:

$$\begin{aligned} \begin{aligned} \log (Cases_i)&= \beta _0 + \beta _1MathScore_i + \beta _2SomeCollege_i\\&\quad + \beta _3 \log (Income_i) + \beta _4PoorHealth_i\\&\quad + \beta _5Uninsured_i + \beta _6Inactive_i + \beta _7PrimaryCareRate_i \\&\quad + \beta _8Over65_i + \theta _s + \varepsilon _i \end{aligned} \end{aligned}$$
(1)

The dependent variable, \(Cases_i\), is all reported COVID-19 cases per 1000 people for county i. Equation 1 is estimated separately for \(Cases_i\) 30, 60, and 90 days after the first case was reported in each county.

The main variable of interest in this study is \(MathScore_i\), which is the average observed grade level of math comprehension for eighth graders in county i. To further control for education, we also include \(SomeCollege_i\), which shows the percent of county i that has some college education. Controlling for those who have some college education allows us to focus on the role of education at the K-12 level when interpreting \(MathScore_i\).

We also include a series of variables as controls that are also likely to influence the number of COVID-19 cases. The relationship between income and health is multifaceted through clinical, behavioral, and environmental factors (Khullar et al. 2018). Lower-income households are more likely to work jobs that are not able to be performed remotely and require human interaction (Parker et al. 2020; Gould and Shierholz 2020). Therefore, we include \(Income_i\), which is the median income in county i.

\(PoorHealth_i\) is a variable that indicates the percent of county i that is in poor health, \(Inactive_i\) is the percent of county i that is not physically active, and \(Over65_i\) is the percent of county i that is over the age of 65. These are important factors to control for as they increase the risk of severe illness from COVID-19 (Centers for Disease Control 2021).

Tolbert et al. (2020) note that people without health insurance are less likely to seek help when they get sick. This may have led some people without health insurance to continue about their daily activities while infected with COVID-19, possibly spreading it to others. Because of this, we include the variable \(Uninsured_i\), which measures the percent of county i that does not have health insurance. We also include the variable \(PrimaryCareRate_i\), which is the rate of primary care physicians in county i because people may have been less likely to get tested or treated for COVID-19 in counties with a low rate of primary care physicians.

Lastly, we include \(\theta _s\) as a state fixed effect to control for state-level policies, such as mask mandates and stay-at-home orders. We recognize that some counties implemented their own policies that may have been stricter or implemented at different times from the state policy. To our knowledge, these data are not currently available at the county level, and this represents a limitation of the current analysis. We estimate Eq. 1 with and without the state fixed effect as a robustness check to ensure that the state policies did not change our main result (see Footnote 1). We also estimate Eq. 1 with mathematics scores from 2009 and 2015 separately to see if there is any difference between age groups. People who took the test in 2009 would be 24–25 during the pandemic and people who took the test in 2015 would be 18–19. The first group will include most working young adults, and the second group will include recent high school graduates and college students.
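To make the role of the state fixed effect concrete, the sketch below uses the standard within (demeaning) transformation: with a single regressor, the fixed-effect slope equals the slope from regressing state-demeaned outcomes on state-demeaned scores. This is a deliberately simplified illustration with one regressor and made-up data, not the authors' estimation code; the function name and toy values are assumptions.

```python
from collections import defaultdict


def within_slope(rows):
    """Slope on x from a regression of y on x with group (state) fixed effects.

    rows: list of (state, x, y) tuples. Single-regressor case only.
    """
    # Accumulate per-state sums of x and y and the number of counties.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for state, x, y in rows:
        s = sums[state]
        s[0] += x
        s[1] += y
        s[2] += 1
    # Regress demeaned y on demeaned x (no intercept needed after demeaning).
    num = den = 0.0
    for state, x, y in rows:
        sum_x, sum_y, n = sums[state]
        dx = x - sum_x / n
        dy = y - sum_y / n
        num += dx * dy
        den += dx * dx
    if den == 0.0:
        raise ValueError("x does not vary within any state")
    return num / den


# Toy data (hypothetical): y = state effect - 0.1 * x, where the state with
# higher scores also has a higher baseline level of log cases.
toy = [("A", x, 2.0 - 0.1 * x) for x in (7.0, 7.5, 8.0)]
toy += [("B", x, 3.0 - 0.1 * x) for x in (8.5, 9.0, 9.5)]

fe_slope = within_slope(toy)
# Pooled slope: treat every county as belonging to one group (no fixed effect).
pooled_slope = within_slope([("all", x, y) for _, x, y in toy])
```

In this toy example the pooled slope has the wrong sign because state-level differences are confounded with scores, while the within estimate recovers the slope of -0.1. Comparing the two mirrors the with and without fixed effect comparison described above.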

Political preferences

As states implemented mask mandates, stay-at-home orders, and social distancing guidelines to try to slow the spread of COVID-19, there was intense politicization and resistance by President Donald Trump and some Republican members of Congress (Kahane 2021; Dyer 2020). These measures were a debated issue in the 2020 presidential election. As such, it is possible that people who live in counties that voted for Donald Trump over Joe Biden may have been less inclined to follow the guidelines and mandates. We estimate a second set of regressions to test this:

$$\begin{aligned} \begin{aligned} \log (Cases_i)&= \beta _0 + \beta _1MathScore_i + \beta _2MathScore_i\\&\quad \times BidenPercent_i + \beta _3BidenPercent_i\\&\quad + \beta _4SomeCollege_i + \beta _5\log (Income_i) \\&\quad + \beta _6PoorHealth_i + \beta _7Uninsured_i\\&\quad + \beta _8Inactive_i + \beta _9PrimaryCareRate_i\\&\quad + \beta _{10}Over65_i + \theta _s + \varepsilon _i \end{aligned} \end{aligned}$$
(2)

where \(BidenPercent_i\) is a variable that represents the percent of each county that voted for Joe Biden in the 2020 presidential election. The interaction term allows us to calculate the marginal effect of a one-grade-level increase in mathematics scores on the number of COVID-19 cases by the vote share of each county that went to Joe Biden. We estimate Eq. 2 for all counties, only urban counties, and only rural counties.
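For reference, the marginal effect implied by the interaction term can be written out directly from Eq. 2. As a sketch, treating \(BidenPercent_i\) as continuous and holding all other regressors fixed, differentiating Eq. 2 with respect to \(MathScore_i\) gives:

$$\begin{aligned} \frac{\partial \log (Cases_i)}{\partial MathScore_i} = \beta _1 + \beta _2 BidenPercent_i \end{aligned}$$

Under this expression, \(\beta _1\) alone is the effect of a one-grade-level increase in a county where Joe Biden received none of the vote, and the effect shifts by \(\beta _2\) for each one-unit increase in \(BidenPercent_i\).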

Results

Mathematics scores

Fig. 1 Relationship between math scores and COVID-19 cases. Notes: This figure shows the relationship between standardized mathematics score and logged COVID-19 cases at the county level. Math scores are from the Stanford Education Data Archive and are a scaled value of a specific county’s test scores relative to the national average

Table 2 OLS regression results for math scores on log COVID-19 cases

Figure 1 shows the relationship between log COVID-19 cases and mathematics scores 30, 60, and 90 days after each county reported its first case. There is a clear inverse relationship between the number of COVID-19 cases and mathematics scores at all three time points. To test whether this result holds after adding controls for other factors that may influence the spread of COVID-19 at the county level, we present the results of Eq. 1 in Table 2. Columns 1, 3, and 5 show the results without the inclusion of a state fixed effect, and columns 2, 4, and 6 show the results with the state fixed effect. A one-grade-level increase in mathematics scores leads to a 15% reduction in COVID-19 cases 30 days after the first case was reported in each county. The estimate is statistically significant at the 1% level and does not change when adding a state fixed effect, indicating this result is robust to variation in state-level policies. Put differently, had a county increased its average level of mathematics comprehension by one grade level, it would have had 15% fewer cases 30 days after its first case was reported.
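Because the dependent variable in Eq. 1 is in logs, a coefficient maps to a percent change in cases. The sketch below shows the standard conversion, \(100 \times (e^{\beta } - 1)\), which is close to \(100 \times \beta \) when \(\beta \) is small. The coefficient value used here is hypothetical and is not taken from Table 2, and the function name is an assumption for illustration.

```python
import math


def pct_change_from_log_coef(beta, delta=1.0):
    """Percent change in the outcome for a `delta`-unit increase in a regressor
    when the outcome enters the regression in logs."""
    return 100.0 * (math.exp(beta * delta) - 1.0)


# Hypothetical coefficient on MathScore (NOT a value reported in the paper).
beta_math = -0.15
exact = pct_change_from_log_coef(beta_math)   # exact conversion
approx = 100.0 * beta_math                    # small-coefficient approximation
print(f"exact: {exact:.1f}%, approximation: {approx:.1f}%")
```

The two figures diverge as the coefficient grows in magnitude, so it is worth checking which convention a reported percentage follows.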

To ensure that this effect is stable across time, we also report the regression results for 60 and 90 days after each county reported its first case. We find that there remains a statistically significant decrease in COVID-19 cases 60 and 90 days after each county reported its first case. After 60 days, a one-grade-level increase in mathematics scores led to a 14% reduction in the number of COVID-19 cases. When including a state fixed effect, the effect drops to 11%. After 90 days, a one-grade-level increase in the mathematics scores led to a 9% reduction in the number of COVID-19 cases, and when including a state fixed effect, the effect drops to 7%.

The effect of mathematics comprehension decreases over time. This may be because, early in the pandemic, people responded by changing their mobility patterns, spending more time at home and less time in places of work and other public places. However, people started to return to their pre-pandemic mobility patterns over time, even before the lifting or modification of stay-at-home orders (Murray 2021). It is possible people faced a trade-off between income and health during the pandemic (Palma et al. 2020) and were worried about financial security and dealing with loneliness (Tull et al. 2020). Additionally, the way people synthesized and processed information, as well as the sources from which they got it, could have influenced how people behaved over time, something we explore further in the next subsection.

The estimates from Eq. 1 use the average mathematics test scores from 2009 to 2015 for each county. However, people who were in eighth grade in 2009 would have been 24–25 years old in 2020, and people who were in eighth grade in 2015 would have been 18–19 years old. While only a 6-year difference, those who are 18–19 versus 24–25 could represent very different groups. Since we do not have panel data on individuals, we are unable to see who from a given county went to college and who did not, but it is likely that the 24–25 age group in 2020 captures most people who have gone to college and that its members are more likely to be married and have children than those aged 18–19. In Table 3, we estimate Eq. 1 separately for mathematics scores in 2009 and 2015 to see if there is any difference between the two cohorts of students. While mathematics scores from 2015 have a slightly larger effect than mathematics scores from 2009, the difference between the two groups is not statistically significant.

Table 3 OLS regression results for math scores by cohort on log COVID-19 cases

Political preferences

Table 4 OLS regression results for math scores on log COVID-19 cases by 2020 election results

Kahane (2021) and Dyer (2020) show that there was politicization of the NPIs and state policies that were enacted to slow the spread of COVID-19. Nagler et al. (2020) show that this led to conflicting information being presented to the public. Mitchell et al. (2016) show that people tend to get their news from the same sources and there is a divide between where people of different political affiliations get their news. The debate over the NPIs took center stage in the 2020 presidential election between Donald Trump and Joe Biden. Because of this, we estimate Eq. 2 to see if there was a difference in the impact of mathematics scores based on the vote share in each county for Joe Biden. The results for all counties can be found in columns 1, 2, and 3 of Table 4, and the marginal effects can be found in Panel A of Fig. 2. We find there is an inverse relationship between the vote share for Joe Biden and the percent change in COVID-19 cases. The larger the share of a county’s vote that went to Joe Biden, the larger the reduction in COVID-19 cases from a one-grade-level increase in mathematics scores among young adults at 30, 60, and 90 days after the first case. This effect is statistically significant where Joe Biden won 25% or more of the vote 30 days after the first case, 40% or more after 60 days, and 45% or more after 90 days.
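Statements of the form "significant where the vote share is X% or more" come from evaluating the marginal effect and its standard error at different vote shares. A minimal sketch of that calculation follows, using the usual variance formula for a linear combination of two coefficients. All coefficient, variance, and covariance values below are hypothetical placeholders rather than estimates from Table 4, and the function names are assumptions for illustration.

```python
import math


def marginal_effect(b_math, b_inter, biden_pct):
    """Effect of a one-grade-level increase in math scores at a given vote share."""
    return b_math + b_inter * biden_pct


def marginal_effect_se(var_math, var_inter, cov, biden_pct):
    """Standard error of b_math + b_inter * biden_pct."""
    variance = var_math + biden_pct ** 2 * var_inter + 2.0 * biden_pct * cov
    return math.sqrt(variance)


def is_significant(effect, se, critical=1.96):
    """Two-sided test at roughly the 5% level (normal approximation)."""
    return abs(effect) > critical * se


# Hypothetical estimates (NOT from the paper); vote share in percentage points.
B_MATH, B_INTER = 0.02, -0.004
VAR_MATH, VAR_INTER, COV = 0.0025, 1e-6, -4e-5

for share in (0, 20, 40, 60):
    effect = marginal_effect(B_MATH, B_INTER, share)
    se = marginal_effect_se(VAR_MATH, VAR_INTER, COV, share)
    flag = "significant" if is_significant(effect, se) else "not significant"
    print(f"Biden share {share}%: effect {effect:+.3f} (SE {se:.3f}), {flag}")
```

The covariance term matters: leaving it out would misstate the width of the confidence band and therefore the vote share at which the effect becomes distinguishable from zero.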

Fig. 2 Marginal effects of an increase in mathematics scores

Fig. 3 Marginal effects of an increase in mathematics scores for states where the young adult vote share went to Donald Trump

Generally, people lack perfect information and the capability to completely process information and thus use heuristic principles to reduce the complex task of assessing probabilities and making value judgements (Tversky and Kahneman 1974; Albar and Jetter 2009). Heuristics generally lead to good decision making, but can sometimes lead to severe and systematic errors in outcomes (Tversky and Kahneman 1974). We do not know if people’s heuristic principles differ based on their political affiliation and who they supported in the 2020 presidential election. As such, this result could have multiple interpretations. These data could indicate that people who live in counties that voted for Donald Trump were using less mathematical reasoning in understanding the risks associated with COVID-19 than those who live in counties that voted for Joe Biden. It could also be due to the messaging coming from Donald Trump and the media outlets commonly watched by his supporters, which helped reinforce their prior conclusions, whereas supporters of Joe Biden may have consumed information from different outlets that reinforced different conclusions. There are many complicated factors that could influence how people consume, process, and evaluate the credibility of information based on political affiliation, which was likely a factor in how people chose to comply with NPIs, particularly in the middle and later stages of the pandemic. These factors are difficult to untangle with the data at hand but are worth exploring further.

It is possible that this result is capturing urban areas instead of the vote share because urban areas were more likely to vote for Joe Biden than rural areas. We estimate Eq. 2 separately for just urban counties (results shown in columns 4, 5, and 6 of Table 4 and marginal effects in Panel B of Fig. 2) and rural counties (results shown in columns 7, 8, and 9 of Table 4 and marginal effects in Panel C of Fig. 2). For both urban and rural counties, the inverse relationship holds. For urban counties, there is a statistically significant effect for 30, 60, and 90 days after the first case when Joe Biden got 50% or more of the vote share. For rural counties, the effect is only statistically significant when Joe Biden got between 50 and 80% of the vote 30 days after the first case and when Joe Biden got 60% or more of the vote 90 days after the first case. The effect shows no statistical significance 60 days after the first case. While the effect is not as strong in rural areas, there is some evidence that it holds there as well. The slight difference in results between urban and rural counties could arise because the returns to education tend to be higher in urban areas relative to rural areas (Baum-Snow et al. 2018; Gould 2007; Combes and Gobillon 2015) and this difference has grown in the last two decades (Autor 2019). Therefore, higher math scores may have a larger impact in urban counties than in rural counties.

Since our focus is restricted to young adults (ages 18–25), it is also possible our results are influenced by the fact that young adults were more likely to vote for Joe Biden than Donald Trump in the 2020 presidential election (Center for Information & Research on Civic Learning and Engagement 2020). To see if the effect still holds, we estimate Eq. 2 restricted to only states where more young adults voted for Donald Trump than Joe Biden (see Footnote 2). The marginal effects for this regression can be found in Fig. 3. There was still an inverse relationship showing that a one-grade-level increase in mathematics scores had a greater effect of reducing the number of COVID-19 cases as the vote share for Joe Biden increased. This effect is statistically significant when Joe Biden got 45% of the vote or more after 30 days and where he got 60% or more after 60 and 90 days.

These findings should not be interpreted as causal, as there could be other explanations for these effects that we have discussed. However, these findings could suggest that while better education in mathematics resulted in fewer COVID-19 cases across all counties, political leanings may also have influenced how people responded to public policy and NPIs. This could be due to source credibility, the way people synthesized information, and/or how these varied across people of different political leanings.

Conclusion

In this paper, we show that counties with higher standardized test scores in mathematics among young adults had a lower number of COVID-19 cases. This effect was robust to variation in state level policies, but the effect did appear to decline over time. This decline could be due to financial insecurity of households as well as the different heuristic principles people used to consume and process information. As the COVID-19 pandemic is coming to an end, this is an important finding that may help combat future public health crises. If people have the proper training in mathematics, this may allow them to interpret the data on their own and not have to rely on conflicting information from news outlets and politicians.

Davis and Simmt (2003) show that understanding mathematics is important to understanding the complexities of the varying fields of science and epidemiology. At the time this manuscript was prepared, county-level data on standardized science scores were not available, so we were not able to investigate the relationship between mathematics and science scores in greater detail, but future studies should seek to further explore this relationship.

Secondly, we also show that political affiliation may have had an impact on the spread of COVID-19. While we show that higher mathematics scores led to a lower number of COVID-19 cases, this effect was stronger as the vote share increased for Joe Biden in the 2020 election. We are not able to determine causality with this relationship, but there are opportunities for future researchers to explore the mechanisms of these differences.

The findings in this paper suggest that policymakers at the state and local level should seek to implement policies to increase the level of education and comprehension of mathematics at the K-12 level, which may help mitigate future public health crises. However, more work is needed in this area. Given the recency of the pandemic, there are data that are not currently available at the county level that would allow for a more thorough investigation into the impact of education in mathematics on the spread of COVID-19. As more data become available, there is an opportunity for future research to better establish the mechanisms through which education in mathematics may help combat public health crises.