To analyze racial and ethnic disparities, we combine ZIP code level data on COVID-19 outcomes from state and local government websites with data from (1) the 2018 ACS 5-year sample, (2) the 2010 Census, (3) the Opportunity Atlas, (4) SafeGraph mobility data, (5) health professional shortage areas published by the Health Resources & Services Administration, and (6) conditional life expectancy published by the Centers for Disease Control and Prevention.
From state and local websites, we gathered COVID-19 data at the ZIP code level for six metropolitan areas: New York City, Chicago, Atlanta, San Diego, St. Louis, and Baltimore.Footnote 3 The ZIP codes cover the city proper and, in some cases, the surrounding county. Cross-sectional data were gathered from June 6, 2020, to June 9, 2020, for these localities.Footnote 4 Once merged with all other sources, the full analysis uses 436 ZIP codes, with 177 in New York City, 58 in Chicago, 49 in Atlanta, 95 in San Diego, 21 in St. Louis, and 36 in Baltimore. Overall, there are approximately 17.7 million people living in these ZIP codes, with nearly half residing in New York City.
Our primary outcome variable is confirmed COVID-19 cases per 10,000 population. Although serological surveys provide strong evidence that confirmed cases are an undercount of total infections, confirmed case numbers still have clear clinical and economic significance. Nationally, the fatality and hospitalization rates for confirmed cases were roughly 5% and 10%, respectively, by June 2020.Footnote 5 Even after discharge from a hospital, persistent symptoms may remain (Carfì et al. 2020). In addition, confirmed infections (which tend to be more severe than those that remain undetected) undoubtedly lead to lost earnings, family strain, psychological distress, and potentially harmful long-term consequences (Eisenberg et al. Forthcoming). Kniesner and Sullivan (Forthcoming) estimate economic losses from COVID-19 at $46,000 per non-fatal case, by applying value per statistical life and relative severity/injury estimates from the Department of Transportation.
All six localities provided counts of COVID-19 cases, which we scale into counts per 10,000 population. Table 1 shows that, when weighted by population, the median ZIP code had 143 cumulative cases per 10,000 population by early June, translating into a cumulative measured infection rate of about 1.4%. Measured infection rates varied substantially, with 12 cases per 10,000 (0.1%) in the lowest decile and 315 cases per 10,000 (3.2%) in the highest decile. In the aggregate, New York City had the highest rate of confirmed cases (2.3%), followed by Chicago (1.7%) and Baltimore (0.9%). The other three localities had confirmed case rates ranging from 0.26% to 0.48%. In the aggregate, there were more than 271,000 confirmed COVID-19 cases in these six cities, approximately 14% of all cases nationally by that point.Footnote 6
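The scaling and population weighting described above can be illustrated with a minimal sketch. The ZIP codes, counts, and function names below are hypothetical, and the weighted-quantile rule (the smallest rate whose cumulative population share reaches the target) is one common definition assumed here for illustration, not necessarily the one used to build Table 1.

```python
def cases_per_10k(cases, population):
    """Scale a raw case count to cases per 10,000 residents."""
    return 10_000 * cases / population


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (assumed definition)."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]


# Hypothetical ZIP codes: (confirmed cases, population)
zips = [(30, 25_000), (400, 28_000), (900, 30_000), (150, 60_000)]
rates = [cases_per_10k(c, p) for c, p in zips]
pops = [p for _, p in zips]

# Population-weighted median of the ZIP-level rates
median_rate = weighted_quantile(rates, pops, 0.5)
```

Dividing a rate per 10,000 by 100 gives the percentage figures quoted in the text; for example, 143 cases per 10,000 corresponds to about 1.4%.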
We also conduct auxiliary analyses using a subsample of only Chicago and New York City, cities that provide additional data that allow us to investigate two important issues. First, do observed disparities in confirmed COVID-19 cases accurately reflect disparities in illnesses, or are they confounded by geographic variation in availability of tests and criteria for obtaining them? Note that we will include city fixed effects in all our models, which alleviates this concern to some extent. Nonetheless, since Chicago and New York City report tests run by ZIP code, they enable us to control for testing more directly. Second, are racial and ethnic disparities in COVID-19 fatalities the result of a higher likelihood of catching the virus, a greater risk of dying conditional on catching it, or some combination of both? Answering this question requires data on COVID-19 fatalities—not just cases—and Chicago and New York City are the only cities in our sample that report deaths by ZIP code. The bottom panel of Table 1 shows data for those two cities (and 235 ZIP codes). In these cities, the cumulative fatality rate from COVID-19 was 0.17% by early June.
Census Bureau Data
We merged this information with the 2018 ACS 5-year sample, as well as with the 2010 Decennial Census, using Social Explorer (which provides summary statistics at the ZIP code level).Footnote 7 The ACS contains a rich set of variables on demographics, economic outcomes, and housing characteristics. Demographic variables include percent male, percent foreign born, and percent aged 18–44, 45–64, 65–74, and 75+ (children under 18 are omitted). Housing variables include density, percent renters, percent vacant units, percent overcrowded (1.5 or more persons per bedroom), and percent of units with 0 or 1 bedrooms. The 2010 Census, although dated, provides information about group quarters, specifically percent of population in nursing homes, correctional facilities, college dormitories, and military barracks. Returning to the ACS 5-year sample, our socioeconomic variables include percent in education bins (dropout, high school, some college, bachelor’s degree; the group beyond a bachelor’s degree is omitted as a reference category), income inequality as measured by the Gini coefficient, and percent in poverty bins (0–49% FPL, 50–74%, 75–99%, 100–149%, 150–199%; 200%+ is omitted). Occupation variables include percent of workers in service occupations, sales, farming, construction, production, or transport (managerial occupations omitted). Transportation variables include percent of workers who use a car, percent who use public transportation, and percent with long commuting times (60+ min). Finally, one of our measures of health access, percent without health insurance, comes from the ACS.
Opportunity Atlas Data
The Opportunity Atlas is a collaboration of the Census Bureau, Harvard University, and Brown University that uses anonymous data following 20 million Americans from childhood to their mid-30s, with many outcomes measured at the Census Tract level (which we aggregate up to ZIP code).Footnote 8 As noted by Chetty et al. (2018), neighborhoods matter at a very granular level, where neighborhoods even one mile away have very little predictive power for child outcomes. We focus on two key variables, which represent long-run opportunity. The first is average annual household income ranking in 2014–2015 for children (in their mid-30s) who grew up in the area, based on having a low-income parent (25th percentile). The second is the fraction of male children who grew up in the area who were in prison or jail on April 1, 2010. We aggregate Census Tracts to the ZIP code level using a crosswalk provided by the Missouri Census Data Center.Footnote 9 We follow the spirit of Courtemanche et al. (2017) by assigning each Census Tract to the ZIP code where the plurality of its residents live. In practice, approximately 53% of Tracts nationally map into one ZIP code only, and roughly 75% of Tracts have at least 80% of their population in one ZIP code.
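The plurality assignment can be sketched as follows. The crosswalk rows, tract identifiers, and values are hypothetical, and the population-weighted mean used to combine Tracts within a ZIP code is an assumption made for illustration, since the text does not state the weighting used.

```python
from collections import defaultdict


def assign_plurality(crosswalk):
    """Map each tract to the ZIP code holding the largest share of its residents.

    crosswalk: iterable of (tract, zip_code, share_of_tract_population).
    Ties keep the first ZIP code encountered.
    """
    best = {}
    for tract, zip_code, share in crosswalk:
        if tract not in best or share > best[tract][1]:
            best[tract] = (zip_code, share)
    return {tract: zip_code for tract, (zip_code, _) in best.items()}


def aggregate_to_zip(tract_values, tract_pop, tract_to_zip):
    """Population-weighted mean of a tract-level variable within each ZIP code
    (the weighting scheme is an assumption of this sketch)."""
    num = defaultdict(float)
    den = defaultdict(float)
    for tract, value in tract_values.items():
        z = tract_to_zip[tract]
        num[z] += value * tract_pop[tract]
        den[z] += tract_pop[tract]
    return {z: num[z] / den[z] for z in num}


# Hypothetical crosswalk: tract T2 is split 60/40 and T3 is split 85/15
crosswalk = [
    ("T1", "10001", 1.00),
    ("T2", "10001", 0.60),
    ("T2", "10002", 0.40),
    ("T3", "10002", 0.85),
    ("T3", "10001", 0.15),
]
income_rank = {"T1": 0.40, "T2": 0.50, "T3": 0.30}  # hypothetical outcome
population = {"T1": 4000, "T2": 2000, "T3": 3000}

tract_to_zip = assign_plurality(crosswalk)
zip_income_rank = aggregate_to_zip(income_rank, population, tract_to_zip)
```

Under plurality assignment a split Tract contributes its full value to a single ZIP code, which is a simplification relative to apportioning the Tract across ZIP codes by population share.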
Many recent COVID-19 studies examine mobility using data from SafeGraph, which provides access to their data through free, non-commercial agreements.Footnote 10 Following Gupta et al. (2020), we compute the fraction of cell phone devices that were detected to be entirely at home during the day, aggregating from the Census Block Group level to the ZIP code level. We aggregate Census Block Groups to ZIP codes using a crosswalk provided by the Missouri Census Data Center. We compute daily averages for each ZIP code, and then average across all days for the months of March 2020, April 2020, and May 2020. Again, following Courtemanche et al. (2017), we assign Census Block Groups to the ZIP code where the plurality of residents live. In practice, approximately 73% of Census Block Groups nationally map into one ZIP code only, and roughly 85% of Census Block Groups have at least 80% of their population in one ZIP code.
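A minimal sketch of this two-step averaging is given below. The record layout, identifiers, and device counts are hypothetical, and pooling devices across the Census Block Groups assigned to a ZIP code before taking the daily ratio is an assumption made for illustration; the text does not specify how Block Groups are combined within a day.

```python
from collections import defaultdict
from datetime import date


def monthly_home_share(records, cbg_to_zip):
    """Average daily at-home share by ZIP code and month.

    records: iterable of (day, cbg, devices_seen, devices_home_all_day).
    Step 1: pool devices across the block groups assigned to each ZIP code
            and compute the daily at-home share (assumed pooling rule).
    Step 2: average the daily shares within each calendar month.
    """
    daily = defaultdict(lambda: [0, 0])  # (zip, day) -> [home, total]
    for day, cbg, devices, home in records:
        key = (cbg_to_zip[cbg], day)
        daily[key][0] += home
        daily[key][1] += devices

    monthly = defaultdict(list)
    for (zip_code, day), (home, total) in daily.items():
        if total > 0:  # skip ZIP-days with no observed devices
            monthly[(zip_code, day.year, day.month)].append(home / total)

    return {key: sum(shares) / len(shares) for key, shares in monthly.items()}


# Hypothetical block groups A and B, both assigned to ZIP code 60601
cbg_to_zip = {"A": "60601", "B": "60601"}
records = [
    (date(2020, 4, 1), "A", 100, 40),
    (date(2020, 4, 1), "B", 100, 60),
    (date(2020, 4, 2), "A", 200, 100),
    (date(2020, 5, 1), "A", 50, 10),
]
home_share = monthly_home_share(records, cbg_to_zip)
```

Averaging daily shares, rather than pooling all device-days in a month, gives each day equal weight regardless of how many devices were observed that day.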
Health Professional Shortage Area Data
We incorporate information on each ZIP code’s designation as a health professional shortage area (HPSA) for federal fiscal year 2020.Footnote 11 HPSAs are designated by the Health Resources and Services Administration (HRSA) to identify medically underserved areas. The Centers for Medicare and Medicaid Services (CMS) provide HPSA designation status at the ZIP code level to signal to eligible health care professionals (e.g., physicians) whether the location where they practice is eligible for enhanced Medicare reimbursements per the 2005 Medicare Modernization Act (U.S. Centers for Medicare and Medicaid Services 2020). This feature of the program creates a financial incentive for delivering care in medically underserved settings with historically higher uninsured rates and limited access to care. We use this designation as a proxy measure to capture differences in access to primary care and mental health services. HPSAs can be entire counties, but are most commonly smaller portions of a county; this is particularly the case in larger cities.
Centers for Disease Control and Prevention Data
Our population health variables are conditional life expectancies obtained from the U.S. Small-area Life Expectancy Estimates Project (USALEEP).Footnote 12 The files contain conditional life expectancies for different age bins at the Census Tract level; in our model, we include conditional life expectancies for ages 65–74, 75–84, and 85 plus, and aggregate from the Tract level to ZIP code. Many commentators attribute disparities in COVID-19 cases and deaths to underlying health conditions such as elevated rates of chronic illnesses among Blacks and Latinos (Artiga et al. 2020). We use variation in life expectancy—and focus on the elderly who are most vulnerable to COVID-19—to control for variation in underlying health status as well as the risk factors leading to differences in preventable mortality.
Table 2 shows, along some margins, large differences in neighborhood characteristics depending on racial and ethnic composition. Out of the 436 ZIP codes, 188 are majority White, 84 are majority Black, 49 are majority Hispanic, and 115 are none of these. With respect to demographics, Hispanic neighborhoods have much higher representation of foreign-born individuals. With respect to housing, there are more renters in majority-Black and majority-Hispanic neighborhoods. Lower educational attainment and higher poverty levels are also common attributes of these neighborhoods, as are lower levels of income mobility, an indicator of long-run opportunity. At least some types of workers whose jobs do not easily transfer online (those in service occupations) are more prevalent in predominantly Black and Hispanic neighborhoods. Also common in predominantly Black and Hispanic neighborhoods are greater dependence on public transit as a key mode of transportation (McLaren 2020) and longer commuting times. Cell phone mobility measures are relatively similar, on average, across neighborhoods. Health care access is worse for Black and Hispanic neighborhoods according to both percent uninsured and mental health HPSA, while population health, proxied by conditional life expectancy, is fairly similar across neighborhood types, especially from age 75 onward. Finally, racial composition and segregation vary by city. None of the ZIP codes in San Diego are majority Black, while Atlanta, Baltimore, and St. Louis have no ZIP codes that are majority Hispanic.