Key Summary Points

Why carry out this study?

Understanding the baseline characteristics of patients hospitalized with COVID-19 across a variety of US geographic regions is essential for targeting clinical care and allocating resources.

This study examined baseline demographics and clinical characteristics of patients hospitalized with COVID-19 and pulmonary involvement from a US electronic health records database that includes patients primarily from the South and Midwest; patients were stratified by geographic region, race, sex, and age.

What was learned from this study?

Among US patients primarily from the South and Midwest hospitalized with COVID-19 and pulmonary involvement, the most common comorbidities were hypertension, diabetes, hyperlipidemia, and obesity.

Compared with White patients, African American patients were younger, with higher mean BMI, higher prevalence of concurrent diabetes, and lower prevalence of COPD and smoking/tobacco use; it is important to view these data in light of underlying reasons for racial disparities in COVID-19.

This study provides real-world data from geographic regions with limited published COVID-19 data; such information can be used to help target clinical care and allocate resources.

Digital Features

This article is published with digital features to facilitate understanding of the article. To view digital features for this article go to https://doi.org/10.6084/m9.figshare.12961658.

Introduction

The global coronavirus disease 2019 (COVID-19) pandemic began in Wuhan, China, and quickly spread internationally [1]. As of July 28, 2020, there were over 4.3 million confirmed cases of COVID-19 in the US and over 148,000 deaths [2]. COVID-19 can present as a range of symptoms, from mild to critical; lower pulmonary involvement, including severe pneumonia, dyspnea, or blood oxygen saturation ≤ 93%, among other potential symptoms, is often associated with severe and critical cases [3].

Understanding the baseline characteristics of patients hospitalized with COVID-19 illness is essential for effectively targeting clinical care and allocating resources. Two large studies of patients hospitalized with COVID-19 illness in New York observed that the majority of patients were male and the most common comorbidities were hypertension, obesity, and diabetes [4, 5]. In comparison, a study of patients hospitalized with COVID-19 in Louisiana reported that the majority of patients were female and 70% were Black non-Hispanic; in that study, increased odds of hospital admission were associated with Black race, increasing age, higher Charlson Comorbidity Index (CCI), public insurance, residence in a low-income area, and obesity [6]. These studies demonstrate that patients from one state may not be representative of patients throughout the US and that studies across a variety of geographic regions are needed to specifically tailor clinical responses to COVID-19.

The objective of this study was to examine baseline demographics and clinical characteristics of US patients hospitalized with COVID-19 and pulmonary involvement. This study utilized the IBM Explorys® US electronic health records (EHR) database, which includes a racially diverse group of patients primarily from the South and Midwest. Patients in the database with COVID-19 and pulmonary involvement from December 1, 2019, to May 20, 2020, were included; patients were examined overall and stratified by sex, age, race, and geographic region.

Methods

Study Design and Patient Population

This retrospective study used EHR data from the US-based IBM Explorys® dataset from December 1, 2019, to May 20, 2020. This database is a convenience sample of health systems and contains data on approximately 70 million patients (since 2000) from 39 large health systems that include approximately 400 hospitals and 400,000 providers. Currently, the Explorys population comprises patients from the Midwest (39%), South (33%), West (18%), and Northeast (5%). Explorys includes EHR, outgoing billing, and adjudicated claims from both commercial and public payers. The venue of care is longitudinal, including inpatient, ambulatory, emergency, and post-acute care data. Data include diagnoses, procedures, laboratory results, vital signs, patient-reported outcomes, encounter-level data, providers, and other clinical, financial, and operational data.

This was a retrospective, observational study, and all study data were deidentified and fully compliant with Health Insurance Portability and Accountability Act regulations (45 CFR 164.514(e)); therefore, approval from an institutional review board was not required, and informed consent was not obtained. Variables were measured using International Classification of Diseases, Clinical Modification (ICD-9-CM and/or ICD-10-CM) diagnosis codes; Systematized Nomenclature of Medicine codes; procedure codes; Healthcare Common Procedure Coding System codes; and National Drug Codes, as appropriate.

Patients with a COVID-19 diagnosis code and a diagnosis code for pulmonary involvement based on the Centers for Disease Control and Prevention coding guidelines were included (Table S1 in the electronic supplementary material) [7]. Pulmonary involvement included pneumonia, acute bronchitis, bronchitis not otherwise specified, lower respiratory tract infection, other specified respiratory disorder (associated with a respiratory infection not otherwise specified), and acute respiratory distress syndrome. If a patient had more than one hospitalization with COVID-19 and pulmonary involvement, only the first hospitalization was included in this analysis.

Outcomes

Patient demographics, baseline clinical characteristics, and prescribed medications were assessed using available data from up to 12 months through 14 days before hospital admission (baseline period) for patients who had at least one encounter during this period. Symptoms during the 14 days to 1 day prior to hospitalization were assessed separately. Demographics included age, sex, patient-reported race/ethnicity, geographic region, smoking status, body mass index (BMI), and month of admission. Self-identified race data were included as study variables to characterize hospitalized patients. Comorbid conditions included 17 CCI [8] conditions and several select comorbid conditions, including hypertension, diabetes, coronary artery disease, and respiratory conditions (chronic obstructive pulmonary disease, asthma, sleep apnea, and pulmonary fibrosis).
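
The two look-back windows described above can be sketched in a few lines of Python. This is only an illustration, not the extraction logic used in the study: the function names `lookback_windows` and `in_window`, the treatment of 12 months as 365 days, and the admission date are assumptions made here for concreteness. The text does not say how the shared boundary at day 14 was handled, so the sketch includes that day in both windows, as the wording reads.

```python
from datetime import date, timedelta


def lookback_windows(admission: date) -> dict:
    """Return inclusive (start, end) date ranges relative to an admission date."""
    # Baseline period: up to 12 months (assumed here to be 365 days)
    # through 14 days before admission.
    baseline = (admission - timedelta(days=365), admission - timedelta(days=14))
    # Symptom window: 14 days to 1 day before admission.
    symptoms = (admission - timedelta(days=14), admission - timedelta(days=1))
    return {"baseline": baseline, "symptoms": symptoms}


def in_window(encounter: date, window: tuple) -> bool:
    """True if the encounter date falls inside the inclusive window."""
    start, end = window
    return start <= encounter <= end


# Hypothetical admission date, chosen only to exercise the functions.
windows = lookback_windows(date(2020, 4, 15))
print(windows["baseline"], windows["symptoms"])
```

An encounter is then assigned to a window by calling `in_window` with the encounter date; a patient with no encounter in the baseline window would have no baseline data under this scheme.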

Statistical Analysis

Results are presented for the overall patient sample and were stratified by geographic region, race, sex, and age category (< 60 years and ≥ 60 years). To compare patient characteristics between cohorts in the stratified data, t tests and χ² tests were used. P values < 0.05 were considered statistically significant.
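
As an illustration of the χ² comparison of proportions between two cohorts, the following Python sketch applies a Pearson χ² test to a 2 × 2 table. It is a sketch under stated assumptions rather than the study's analysis code: the function name `chi_square_2x2` is hypothetical, the absence of a continuity correction is an assumption (the variant used is not stated), and the counts are back-calculated by rounding from the percentages reported for preadmission healthcare encounters in the South (42.2% of 2834) and Midwest (69.6% of 583), so they are approximate.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-square test, without continuity correction, for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square upper-tail probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Approximate counts reconstructed from reported percentages (assumption).
south_n, midwest_n = 2834, 583
south_yes = round(0.422 * south_n)
midwest_yes = round(0.696 * midwest_n)

chi2, p_value = chi_square_2x2(
    south_yes, south_n - south_yes,
    midwest_yes, midwest_n - midwest_yes,
)
print(f"chi2 = {chi2:.1f}, P < 0.001: {p_value < 0.001}")
```

The closed-form P value relies on the 2 × 2 table having a single degree of freedom; larger tables would need a general χ² distribution function from a statistics library.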

Results

Demographics and Clinical Characteristics

Between December 1, 2019, and May 20, 2020, 91,380 patients had a pulmonary diagnosis; after applying exclusion criteria, 3471 patients had diagnoses of COVID-19 and pulmonary involvement during an inpatient admission and were included (Fig. 1; Table S1 in the electronic supplementary material). The most common pulmonary diagnoses were pneumonia (56.1%) and lower respiratory infection (43.4%).

Fig. 1 Patient attrition. ^a Pneumonia, acute bronchitis, bronchitis not otherwise specified, lower respiratory tract infection, other specified respiratory disorder, acute respiratory distress syndrome. COVID-19, coronavirus disease 2019

The mean (SD) and median age were 63.5 (16.3) and 64.0 years, respectively; 1776 patients (51.2%) were female and 1910 (55.0%) were African American (Table 1). Overall, 2834 patients (81.6%) were from the South and 583 (16.8%) were from the Midwest; the most highly represented states included Louisiana (1677 patients [48%]), Ohio (559 [16%]), and Maryland (461 [13%]) as well as the District of Columbia (496 [14%]). Patients’ insurance status included unknown (57.6%), private insurance (17.7%), Medicare (14.1%), self-pay (6.8%), and Medicaid (3.1%).

Table 1 Demographic characteristics prehospitalization

Overall, mean (SD) and median CCI were 0.92 (1.89) and 0, respectively, and the most common comorbidities (n [%]) during the baseline period were hypertension (961 [27.7%]), diabetes (600 [17.3%]), hyperlipidemia (566 [16.3%]), and obesity (338 [9.7%]; Table 2). The mean (SD) and median BMIs in patients with a recorded measurement (2460 patients [70.9%]) were 32.05 (8.85) and 30.70, respectively. A history of smoking/tobacco use was self-reported by 994 patients (28.6%). The most common (n [%]) preadmission prescriptions were atorvastatin (332 [9.6%]), amlodipine (312 [9.0%]), and metoprolol (238 [6.9%]). The most common (n [%]) select baseline prescriptions were steroids (314 [9.0%]), angiotensin II receptor blockers (ARBs; 256 [7.4%]), angiotensin-converting enzyme inhibitors (ACEi; 199 [5.7%]), and bronchodilators (143 [4.1%]). There were no baseline prescriptions for Janus kinase inhibitors, dornase alfa, baloxavir marboxil, or intravenous immunoglobulin. Overall, 1621 patients (46.7%) had a healthcare encounter during the 14 days to 1 day prior to their first COVID-19 admission; cough, fever, breathing abnormalities/dyspnea, and hypoxemia were reported in 443 patients (27.3%), 373 (23.0%), 246 (15.2%), and 42 (2.6%), respectively.

Table 2 Twelve-month baseline clinical characteristics prior to hospitalization

Analyses Stratified by Geographic Region

The largest geographic regions represented were the South (n = 2834) and Midwest (n = 583). Among patients from the South, 1723 (60.8%) were African American and 772 (27.2%) were White, while in the Midwest, 181 (31.0%) were African American and 347 (59.5%) were White. Patients from the South had lower mean (SD) CCI than patients from the Midwest (0.88 [1.87] vs. 1.12 [2.03]; P = 0.006); however, the median CCI was the same for both groups (0 for both). Lower proportions of patients from the South had hypertension (26.0% vs. 37.0%; P < 0.001), diabetes (16.5% vs. 21.3%; P = 0.005), hyperlipidemia (14.6% vs. 25.4%; P < 0.001), or obesity (8.6% vs. 15.8%; P < 0.001) than patients from the Midwest. In addition, a lower proportion of patients from the South had a history of smoking/tobacco use than patients from the Midwest (27.4% vs. 37.2%; P < 0.001). Higher proportions of patients from the Midwest had a baseline prescription for steroids, including inhaled steroids (21.6% vs. 6.6%), ACEi (9.8% vs. 5.0%), and bronchodilators (7.0% vs. 3.5%; P < 0.001 for all) than patients from the South. A lower proportion of patients from the South had a preadmission healthcare encounter in the 14 days to 1 day prior to their first COVID-19 admission than patients from the Midwest (42.2% vs. 69.6%; P < 0.001); there were no significant differences in preadmission pulmonary diagnoses among these groups.

Analyses Stratified by Race

Overall, 1910 patients self-identified as African American and 1146 as White. Among African American patients, 1723 (90.2%) were from the South compared with 772 (67.4%) of the White patients. African American patients were significantly younger than White patients (median, 63.0 vs. 69.0 years; mean [SD], 62.5 [15.4] vs. 67.8 [16.2] years; P < 0.001), and a larger proportion were female (54.8% vs. 49.6%; P = 0.005). The mean (SD) CCI was higher in African American patients than in White patients (1.03 [1.98] vs. 0.93 [1.89]), although the difference was not statistically significant; the median CCI was 0 in both groups. Higher proportions of African American patients had diabetes (19.8% vs. 16.7%; P = 0.032) and obesity (11.6% vs. 9.0%; P = 0.025), and African American patients had a higher mean (SD) BMI (33.66 [9.46] vs. 30.42 [7.86]; P < 0.001) and median BMI (32.20 vs. 29.52) than White patients. A higher proportion of White patients than African American patients had hyperlipidemia (19.6% vs. 16.5%; P = 0.030) and chronic obstructive pulmonary disease (COPD; 8.2% vs. 5.6%; P = 0.005) and self-reported a history of smoking/tobacco use (37.2% vs. 28.1%; P < 0.001). Higher proportions of White patients than African American patients also had a prescription for steroids (13.3% vs. 8.0%; P < 0.001), ACEi (7.7% vs. 5.3%; P = 0.010), and bronchodilators (5.8% vs. 3.8%; P = 0.010). A lower proportion of African American patients had a preadmission healthcare encounter than White patients (44.8% vs. 56.1%; P < 0.001), but there were no significant differences in preadmission pulmonary diagnoses between these groups.
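
A t test of this kind can be approximated from summary statistics alone. The Python sketch below does so for the age comparison, using the reported means, SDs, and group sizes. Several points are assumptions rather than details from the study: the function name `welch_t_from_summary` is hypothetical, the unequal-variance (Welch) form is chosen here because the variant used is not stated, age is assumed to be available for every patient in each group, and the two-sided P value uses a normal approximation, which is reasonable only because the degrees of freedom are in the thousands.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic, approximate df, and two-sided P from summary statistics."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Normal approximation to the two-sided P value (valid for large df only).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return t_stat, df, p_value


# Age in years: African American patients vs. White patients, as reported.
t_stat, df, p_value = welch_t_from_summary(62.5, 15.4, 1910, 67.8, 16.2, 1146)
print(f"t = {t_stat:.2f}, df = {df:.0f}, P < 0.001: {p_value < 0.001}")
```

For small groups, the normal approximation should be replaced with the t distribution from a statistics library.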

Analyses Stratified by Sex

Among all patients, 48.8% were male and 51.2% female; female patients were older than male patients (median, 66.0 vs. 63.0 years; mean [SD], 64.8 [16.9] vs. 62.2 [15.5]; P < 0.001). A higher proportion of female patients had COPD (6.9% vs. 5.0%; P = 0.017), and a lower proportion had a history of smoking/tobacco use (26.4% vs. 31.0%; P < 0.001) compared with male patients.

Analyses Stratified by Age

A higher proportion of patients who were < 60 years of age had a preadmission healthcare encounter than patients who were ≥ 60 years of age (51.9% vs. 43.4%; P < 0.001), with cough and breathing abnormalities/dyspnea reported in higher proportions of patients who were < 60 years of age than in those who were ≥ 60 years of age (33.3% vs. 22.8%; P < 0.001 and 18.3% vs. 12.7%; P = 0.002).

Discussion

This study examined baseline demographics and clinical characteristics among US patients primarily from the South and Midwest—regions with limited data on patients with COVID-19—who were hospitalized with COVID-19 illness and pulmonary involvement. Overall, 81.6% of patients were from the South and 16.8% from the Midwest. Two highly represented groups were female (51.2%) and African American (55.0%) patients. The mean (SD) age was 63.5 (16.3) years. Hypertension, diabetes, hyperlipidemia, and obesity were the most common comorbidities. Preadmission pulmonary symptoms were also examined, and the most common were cough and dyspnea.

The most prevalent comorbidities observed were the same as those reported after analysis of EHR data of 5700 patients hospitalized with COVID-19 from the New York City area, with the addition of hyperlipidemia, although they were present in a lower proportion of patients in this study vs. the New York study (hypertension, 27.7% vs. 56.6%; diabetes, 17.3% vs. 33.8%; obesity, 9.7% vs. 41.7%, respectively) [4]. A retrospective cohort study of patients diagnosed with COVID-19 in Louisiana, including nonhospitalized and hospitalized patients, reported similar rates of hypertension and diabetes but a higher rate of obesity than the present study [6]. Diagnoses for obesity, defined by the US Centers for Disease Control and Prevention as BMI ≥ 30, may be underreported in the EHR database used in this study, as both the mean and median BMI were > 30 for the overall population [9].

Analyses by region demonstrated that, when compared with patients from the South (primarily from Louisiana, Maryland, and the District of Columbia), patients from the Midwest (primarily from Ohio) had higher mean CCI and increased prevalence of hypertension, diabetes, hyperlipidemia, and obesity. In comparison, according to a 2019 report on obesity in the US, Ohio has higher rates of obesity, hypertension, and diabetes than Maryland and the District of Columbia but lower rates than Louisiana [10]. There were differences in insurance plan type between regions; however, no conclusions can be drawn because these data were missing for the majority of patients.

Although data on the role of race in COVID-19 illness and outcomes are limited and complex, several studies have demonstrated that there may be increased risk of infection and worse clinical outcomes among Black patients. The potential underlying reasons for these findings are important to consider and may include socioeconomic and health inequalities, resource deprivation, and pathophysiologic differences in response to infection, but further research is needed [6, 11,12,13,14].

In this study, 55.0% of patients hospitalized with COVID-19 were African American. African American patients are not overrepresented in the Explorys database overall: patients in the South are 16.0% African American and 43.0% White, and patients in the Midwest are 10.9% and 62.9%, respectively. This reflects the US population, which according to the latest Census data is 13.4% Black or African American and 60.4% White (not Hispanic or Latinx) [15].

In the present study, among patients from the South, 60.8% were African American and 27.2% were White, and in the Midwest, 31.0% were African American and 59.5% were White. Increased rates of infection observed in the African American population in this study might be due to the tendency of the COVID-19 epidemic to be more severe in certain geographic areas (in this case, areas with a high proportion of African American residents). They may also reflect the disease disproportionately affecting lower-income Americans who hold essential jobs and cannot isolate or socially distance. In this study, African American patients were younger and a higher proportion were female compared with White patients; these findings are similar to those of a study of 3626 patients in Louisiana, the majority of whom self-identified as Black non-Hispanic [6]. The mean CCI was higher in African American patients than White patients, although the difference was not statistically significant. In the Louisiana study, a higher mean CCI was also observed in Black patients than White patients hospitalized with COVID-19, and racial differences in CCI have been reported among the general US population [6, 16]. The present study also observed that the proportions of African American patients with diabetes and obesity were higher than those of White patients, a finding similar to that in the Louisiana cohort, although the Louisiana comparison was made regardless of hospitalization status. Differences in insurance plan type were observed when patients were stratified by race; however, no conclusions can be drawn because these data were missing for the majority of patients.

The implications of the differences in baseline characteristics between Black patients and White patients hospitalized with COVID-19 and pulmonary involvement are not yet clear. However, it appears that obesity and diabetes are risk factors for complications in COVID-19 across all races [4,5,6]. Several large studies have found the prevalence of obesity and diabetes in US patients with COVID-19 to be higher than that of the general US adult population, in which 39.5% of adults are obese and 10.2% have diagnosed diabetes [4,5,6, 10, 17]. The prevalence of these health conditions varies among demographic groups, and there are many complex factors underlying these differences, including historical, socioeconomic, and policy inequities that exert systemic effects on daily life that play many roles in health [10, 17]. In a study of > 10,000 COVID-19-related deaths in England, a multivariable model that adjusted for underlying health conditions found that greater deprivation—a socioeconomic measure including income, employment, education, health, crime, and living environment—was associated with increased risk of death and suggests that societal factors likely play a key role in COVID-19 outcomes [14]. Importantly, further studies are needed on the underlying reasons for racial disparities in COVID-19 and other health emergencies to more robustly explain these patterns, mitigate any potential misinterpretations of analyses based on race, and improve outcomes for patients [12,13,14].

This study has several limitations related to retrospective analysis of EHR data. Medications that patients were taking but that were not prescribed in the baseline period or were prescribed by a healthcare provider outside of the systems that contribute to Explorys were not captured. Patients were also presumed to have filled and taken medications as prescribed, but this cannot be confirmed. Diagnoses that were made by providers outside of the systems that contribute to Explorys were not captured, and conditions for which patients did not seek care were not included. Results may not be generalizable to the US population as a whole, as a majority of the patients were from the South and Midwest. Selection for those populations was not intended, as the selection criteria included nationally used ICD codes; these trends reflect the geographic distribution of the Explorys database and may reflect the nature of the COVID-19 outbreak. These large cohorts of Southern and Midwestern patients are also a strength of this study, because currently the data on baseline characteristics of patients with COVID-19 in these regions are limited. Descriptive statistics were used, and additional analyses are needed to adjust for potential confounding factors. Despite these limitations, this study provides important information from a large real-world population of US patients hospitalized with COVID-19 and pulmonary symptoms.

Conclusions

Among US patients hospitalized with COVID-19 and pulmonary involvement primarily from the South and Midwest, the most common comorbidities were hypertension, diabetes, hyperlipidemia, and obesity. The most highly represented demographic groups were patients from the South, African American patients, and female patients. Compared with White patients, African American patients were younger, with higher mean BMI, higher prevalence of concurrent diabetes, and lower prevalence of COPD and smoking/tobacco use. Although this study confirms trends observed elsewhere, it is important to view these data in the context of the complex underlying reasons for racial disparities in COVID-19; further research is needed to understand the implications of these observations.