Key Summary Points

Why carry out this study?

After COVID-19 developed into a pandemic with significant morbidity and mortality, both public health and healthcare professionals questioned the extent to which immunocompromising conditions, such as immune-mediated inflammatory diseases (IMIDs) or malignancies, increased patients’ risk of SARS-CoV-2 infections and related adverse outcomes.

While patients in immunocompromised states are generally considered to be at risk for severe COVID-19, the risk has not been evaluated across major IMIDs and malignancies within a single data source to understand how each individual disease contributes to COVID-19 outcomes.

What was learned from the study?

The risk of developing severe COVID-19 varied across the studied IMIDs and types of malignancy. Patients with rheumatoid arthritis were at a higher risk than those with other IMIDs for severe COVID-19 after adjusting for demographic and clinical characteristics. Notably, patients with rheumatoid arthritis were significantly older and had a higher Charlson comorbidity index score than the general population of patients with COVID-19.

While it is important to protect all patients with IMIDs or malignancies from exposure to SARS-CoV-2, COVID-19 monitoring and management should be tailored to each IMID or type of malignancy, in addition to other risk factors for severe COVID-19.

Introduction

Coronavirus disease 2019 (COVID-19) has become a disastrous pandemic since severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was first detected in December 2019 [1]. As of March 2022, more than 460 million people across nearly 200 countries were infected [1, 2], and approximately 6 million COVID-19-associated deaths were reported worldwide [2]. In the USA, the total confirmed cases reached over 79.3 million, and 965,336 deaths were attributed to COVID-19, placing COVID-19 as the third leading cause of death in the USA in 2020 [3].

Immunocompromised patients, such as those with immune-mediated inflammatory diseases (IMIDs) or malignancies, may be at higher risk for severe outcomes from COVID-19 because of an underlying weakened immune system and/or immunosuppression [4]. In particular, patients with malignancies or IMIDs have a weakened immune status, often have multiple underlying comorbidities, take immunosuppressive medicines, and are more susceptible to infections [5,6,7,8,9,10]. While several studies have evaluated the risk of severe COVID-19 in patients with malignancies or IMIDs, there is still a lack of understanding of how the risks differ among individual conditions [11,12,13,14]. Earlier studies were often limited by small sample sizes during the early period of the pandemic. As a result, those studies were unable to thoroughly evaluate and compare the outcomes of COVID-19 among patients with individual IMIDs and types of malignancies [15,16,17].

As the COVID-19 pandemic continues, understanding the clinical outcomes of patients with IMIDs or malignancies has become critically important. Utilizing a US nationwide electronic health records (EHR) database collected from routine clinical practices across a national network of healthcare providers, this study aimed to evaluate and compare the risk of severe COVID-19 among patients with selected IMIDs and malignancies. In this study, the evaluated conditions of interest included the individual IMIDs of ankylosing spondylitis (AS), atopic dermatitis (AD), psoriasis (PsO), psoriatic arthritis (PsA), rheumatoid arthritis (RA), Crohn’s disease (CD), ulcerative colitis (UC), systemic lupus erythematosus (SLE), and subgroups of malignancies of solid tumors (ST) and hematologic cancers (HC).

Methods

Data Source

The Optum® de-identified Electronic Health Records (Optum® EHR) database contains longitudinal patient care data from healthcare provider organizations with more than 700 hospitals and 7000 clinics across the USA and captures point-of-care diagnostic data that are specific to COVID-19, including patient-level and clinical results from both inpatient and ambulatory settings [18]. All 50 states in the USA and all types of payors are represented, including Medicare, Medicaid, commercial insurers, and self-pay/uninsured. The Optum® EHR provides near-real-time data on the COVID-19 pandemic. Regardless of the test result, patients with a history of any COVID-19 test or diagnosis are included in the Optum® EHR database. All pertinent available data on patients within the cohort are provided back to 2007, and new patients are added to the cohort at each data refresh. The data include patient demographics, inpatient and outpatient visits, comprehensive laboratory data, medications prescribed and administered, and procedures.

The Optum® EHR database included records of polymerase chain reaction (PCR) and antigen tests performed for SARS-CoV-2; the results of those tests, which were verified by a team of medical terminologists and clinicians via manual review; and clinical diagnoses of COVID-19 and other coronavirus-related infections, which can be used to identify patients with SARS-CoV-2 infection. The Optum® EHR database is certified as de-identified following the Health Insurance Portability and Accountability Act’s (HIPAA) statistical de-identification rules. The database is managed according to the Optum® customer data use agreements, and this study was therefore exempt from institutional review board (IRB) approval.

Study Population

Patients with COVID-19 were identified using International Statistical Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) diagnosis codes (U07.1 and U07.2), which were introduced on 1 April 2020, or with a positive laboratory diagnostic test result by a SARS-CoV-2 virus PCR or antigen test [19]. The COVID-19 diagnosis date was defined as the earliest of any positive PCR or antigen test, or diagnosis for COVID-19. The COVID-19 diagnosis dates ranged between 1 February 2020 and 3 March 2021, and the last follow-up date was 28 April 2021.
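The index-date rule described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the idea of passing one date per evidence type are assumptions made for illustration, and only the study window dates come from the text.

```python
from datetime import date
from typing import Optional

# Study window for the COVID-19 diagnosis (index) date, as stated in the text.
STUDY_START = date(2020, 2, 1)
STUDY_END = date(2021, 3, 3)


def index_date(
    first_positive_pcr: Optional[date],
    first_positive_antigen: Optional[date],
    first_diagnosis_code: Optional[date],
) -> Optional[date]:
    """Return the earliest available evidence of COVID-19, or None if there is none.

    The three arguments are hypothetical inputs: the first positive PCR test,
    the first positive antigen test, and the first U07.x diagnosis record.
    """
    candidates = [
        d
        for d in (first_positive_pcr, first_positive_antigen, first_diagnosis_code)
        if d is not None
    ]
    return min(candidates) if candidates else None


def in_study_window(d: Optional[date]) -> bool:
    """True when the index date falls inside the stated study window."""
    return d is not None and STUDY_START <= d <= STUDY_END
```

For example, a patient with a diagnosis code on 1 July 2020 and a positive PCR result on 4 July 2020 would receive 1 July 2020 as the index date under this sketch.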

A minimum of 15 months of active EHRs prior to the index date was required to allow sufficient time to capture patients’ baseline demographics and clinical characteristics. Patients with missing age or sex were excluded. Within this study cohort of all eligible patients with SARS-CoV-2 infection, we created ten subgroups based on their diagnosis for the following conditions of interest within 15 months before the index date of COVID-19 diagnosis: RA, PsA, AS, SLE, PsO, AD, CD, UC, ST, and HC. Only one diagnosis code (in any setting and position) was required, and the conditions of interest were not mutually exclusive. Patients were followed for 3 months from the index date, or until a censoring event, which was defined as death, end of active EHR, or end of data availability (28 April 2021), whichever was earliest.
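The follow-up rule in the preceding paragraph amounts to taking the earliest of several candidate dates. The sketch below shows one way to express it in Python. The 90-day approximation of "3 months" and all names are assumptions for illustration; only the data cutoff date is taken from the text.

```python
from datetime import date, timedelta
from typing import Optional

DATA_CUTOFF = date(2021, 4, 28)      # end of data availability, from the text
FOLLOW_UP = timedelta(days=90)       # assumption: "3 months" approximated as 90 days


def follow_up_end(
    index: date,
    death: Optional[date] = None,
    ehr_end: Optional[date] = None,
) -> date:
    """Earliest of: index + follow-up window, death, end of active EHR, data cutoff."""
    candidates = [index + FOLLOW_UP, DATA_CUTOFF]
    if death is not None:
        candidates.append(death)
    if ehr_end is not None:
        candidates.append(ehr_end)
    return min(candidates)
```

Under these assumptions, a patient entering the cohort on 3 March 2021 is censored at the data cutoff of 28 April 2021, which is why some patients have less than 3 months of follow-up.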

Study Variables

Demographic characteristics included patient age group (0–17, 18–44, 45–64, 65–74, 75–84, 85+ years), sex, race, ethnicity, geographic region (Midwest, Northeast, South, and West), and calendar quarter (Jan–Mar 2020, Apr–Jun 2020, Jul–Sept 2020, Oct–Dec 2020, Jan–Mar 2021) of SARS-CoV-2 infection measured as the index date. We identified the following seven underlying at-risk medical conditions, adapted from the Centers for Disease Control and Prevention (CDC) 2019 Novel Coronavirus case report form [20]: cardiovascular disease (CVD), chronic kidney disease, chronic liver disease, chronic lung disease, malignancy, diabetes, and obesity (body mass index [BMI] ≥ 30). Charlson comorbidity index (CCI) score and underlying at-risk medical conditions were captured during the 15 months prior to the index date (Fig. 1) [21, 22].

Fig. 1 Diagram for study design. *COVID-19 diagnosis made between February 2020 and March 2021. Severe COVID-19 defined as either hospitalization within 14 days prior to and 30 days post the COVID-19 diagnosis, or death within 3 months from the COVID-19 diagnosis

Outcomes

The primary outcome was severe COVID-19, which was defined as either hospitalization within the 14 days prior to and 30 days following the COVID-19 diagnosis, or death within 3 months of the COVID-19 diagnosis. Hospitalization and death were analyzed separately as secondary outcomes. Death data were obtained from the Social Security Administration Death Master File, which reports the month and year of death but not the exact date; to calculate the number of days between diagnosis and death, the date of death was set to the 15th of the month. Hospitalizations within the 14 days prior to the COVID-19 diagnosis were included for two reasons: (1) because the date of death was assigned to the 15th of the month of death, using 14 days prior to diagnosis balances misclassification in both directions, assuming a uniform distribution; and (2) because of the shortage of testing during the earlier phase of the COVID-19 pandemic, many positive test results came back after hospitalization and death. Including the 14 days prior therefore ensured a more complete capture of COVID-19 outcomes.
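The outcome definition above can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation: the function names are hypothetical, "3 months" is approximated as 90 days, and deaths whose imputed date falls slightly before the index date (possible when death occurred in the index month) are counted, which is an assumption the text does not spell out.

```python
from datetime import date, timedelta
from typing import Iterable, Optional, Tuple


def imputed_death_date(year: int, month: int) -> date:
    """The source file reports only month and year, so the day is set to the 15th."""
    return date(year, month, 15)


def is_severe(
    index: date,
    admission_dates: Iterable[date],
    death_year_month: Optional[Tuple[int, int]] = None,
    death_window_days: int = 90,  # assumption: "3 months" taken as 90 days
) -> bool:
    """Hospitalization from 14 days before to 30 days after index, or death within the window."""
    window_start = index - timedelta(days=14)
    window_end = index + timedelta(days=30)
    hospitalized = any(window_start <= a <= window_end for a in admission_dates)

    died = False
    if death_year_month is not None:
        death = imputed_death_date(*death_year_month)
        # Assumption: an imputed date just before the index date still counts.
        died = (death - index).days <= death_window_days

    return hospitalized or died
```

With an index date of 20 July 2020, an admission on 10 July 2020 qualifies as a hospitalization, whereas an admission on 1 September 2020 does not.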

Statistical Analysis

Patient demographics and clinical characteristics were summarized with descriptive statistics. Mean and standard deviation (SD) were reported for continuous variables, and count (N) and proportion (%) were reported for categorical variables. Age- and sex-standardized risk of death, hospitalization, and severe COVID-19 were calculated for comparison and extrapolation using direct standardization with the 2010 US Census data as the standard population [23]. For each condition of interest, imbalances of covariates with respect to the general population of patients with COVID-19 were adjusted by inverse probability weighting (IPW), and confidence intervals (CI) were calculated by bootstrapping. Adjusted risk ratios (aRR) and 95% CIs were estimated. The covariates used for adjustments included baseline demographics, clinical characteristics, and at-risk underlying conditions. The standardized mean differences (SMDs) between all patients with COVID-19 and each condition of interest were calculated to measure the imbalance between patient groups and the general cohort of patients with COVID-19. An SMD < 0.2 was considered an inconsequential imbalance. The descriptive analysis was conducted using SAS Studio 3.81 (SAS Institute Inc., Cary, NC, USA), and the rest of the analyses were performed using R Studio version 3.6.0 (R Studio, Boston, MA, USA) and R packages generalize and tableone.
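To make the weighting idea concrete, the following Python sketch reweights a condition group so that its covariate distribution matches the general cohort, then bootstraps a percentile interval for the resulting risk ratio. It is a deliberate simplification and not the authors' analysis, which used R: it assumes a single categorical covariate (a "stratum" label) rather than the full covariate set, and all function names are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple

Patient = Tuple[str, int]  # (stratum label, outcome coded 0 or 1)


def stratum_shares(strata: List[str]) -> Dict[str, float]:
    """Proportion of patients in each stratum."""
    counts = Counter(strata)
    return {s: c / len(strata) for s, c in counts.items()}


def weighted_risk(group: List[Patient], target: Dict[str, float]) -> float:
    """Risk in `group` after weighting each patient by target share / group share."""
    group_shares = stratum_shares([s for s, _ in group])
    num = den = 0.0
    for s, y in group:
        w = target.get(s, 0.0) / group_shares[s]
        num += w * y
        den += w
    return num / den


def adjusted_risk_ratio(group: List[Patient], cohort: List[Patient]) -> float:
    """Weighted risk in the condition group divided by the crude cohort risk."""
    target = stratum_shares([s for s, _ in cohort])
    cohort_risk = sum(y for _, y in cohort) / len(cohort)
    return weighted_risk(group, target) / cohort_risk


def bootstrap_ci(group, cohort, n_boot=500, seed=0):
    """Percentile 95% interval from resampling both groups with replacement."""
    rng = random.Random(seed)
    estimates = sorted(
        adjusted_risk_ratio(
            rng.choices(group, k=len(group)), rng.choices(cohort, k=len(cohort))
        )
        for _ in range(n_boot)
    )
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]
```

In a toy data set where the condition group is older than the cohort but has the same risk within each age stratum, the crude risk ratio exceeds 1 while the weighted ratio returns to 1, which is the behavior the adjustment is meant to achieve.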

Results

Baseline Patient Demographics and Clinical Characteristics

Patients with SARS-CoV-2 Infection

We identified 499,772 eligible patients with SARS-CoV-2 infection between 1 February 2020 and 3 March 2021 (Fig. 2). Overall, the mean (SD) age at cohort entry was 46.9 (20.7) years, with 57.0% female and 72.5% Caucasian. The majority of the SARS-CoV-2 infections were confirmed with laboratory testing by positive PCR (349,055 [69.9%]) or positive antigen test (15,125 [3.0%]), while the remaining cases (27%) were identified solely on the basis of recorded clinical diagnosis (Fig. 3). Vaccination status appeared to be under-captured in this EHR database. Through the end of available data (28 April 2021), only 28,926 (5.8%) patients had records of receiving at least one dose of a COVID-19 vaccine, whereas national coverage was estimated at 57% in May 2021 [24].

Fig. 2 Patient selection

Fig. 3 Identification of study populations. PCR polymerase chain reaction

Many conditions of interest represented a small proportion of the general cohort of patients with COVID-19, often less than 1%. Patients diagnosed with ST were the largest group, consisting of 42,126 patients (8.4% of the general cohort of patients with COVID-19). In general, patients with any condition of interest tended to be older than the general cohort of patients with COVID-19, while patients with AD (mean age of 34.9 years) were younger than the general cohort. Other patient demographics and clinical characteristics varied between conditions of interest as well (Table 1).

Table 1 Baseline demographics and patient characteristics among patients with COVID-19, by condition of interest

Patients with Severe COVID-19 Outcomes

Among the 499,772 patients with COVID-19 in the study, 67,584 patients (13.5%) developed severe COVID-19 after SARS-CoV-2 infection, including 63,125 hospitalizations and 14,177 deaths. These events were not mutually exclusive: of the 63,125 hospitalized patients, 9718 (15.4%) died within 3 months of the index date, accounting for 68.5% of the 14,177 deaths during the follow-up period.
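Because hospitalization and death overlap, the severe COVID-19 count follows from inclusion-exclusion rather than a simple sum. The short Python check below uses only the counts reported in the text; the variable and function names are illustrative.

```python
# Counts reported in the text.
cohort = 499_772
hospitalized = 63_125
died = 14_177
hospitalized_and_died = 9_718

# Inclusion-exclusion: patients with either event are counted once.
severe = hospitalized + died - hospitalized_and_died


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(severe)                                    # 67584
print(pct(severe, cohort))                       # 13.5
print(pct(hospitalized_and_died, hospitalized))  # 15.4
```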

Patients with and without severe COVID-19 outcomes (hospitalization and/or death) had different patient profiles (Table 2). Patients with severe COVID-19 were older (mean age of 63.5 years vs 44.3 years, SMD = 1.01). They also differed in race, geographic region, and calendar time of COVID-19 diagnosis/positive test, although sex distribution was comparable. Baseline health status, as estimated by CCI score, was worse among patients who had severe COVID-19 outcomes than among patients without severe COVID-19.

Table 2 Baseline demographics and patient characteristics among patients with COVID-19, by outcome

Patients with Underlying At-Risk Conditions

Of all patients with COVID-19, 280,302 (56.1%) had one or more underlying at-risk medical conditions prior to the index date, with CVD as the most common condition (34.9%), closely followed by obesity (34.4%, Fig. 4). Compared with the general cohort of patients with COVID-19, a higher proportion of patients with a condition of interest had underlying at-risk medical conditions. Having four or more comorbid at-risk conditions was more common in all conditions of interest when compared to the general COVID-19 cohort, ranging from 7.3% to 34.9%, compared to 6.2% in the general COVID-19 cohort (Fig. 5).

Fig. 4 Frequency (%) of underlying medical conditions among patients diagnosed with COVID-19, by condition of interest. All general population of patients with COVID-19, RA rheumatoid arthritis, PsA psoriatic arthritis, AS ankylosing spondylitis, SLE systemic lupus erythematosus, PsO psoriasis, AD atopic dermatitis, CD Crohn’s disease, UC ulcerative colitis, ST solid tumor, HC hematologic cancers, CVD cardiovascular disease, BMI body mass index

Fig. 5 Distribution of total underlying medical conditions among patients diagnosed with COVID-19, by condition of interest. All general population of patients with COVID-19, RA rheumatoid arthritis, PsA psoriatic arthritis, AS ankylosing spondylitis, SLE systemic lupus erythematosus, PsO psoriasis, AD atopic dermatitis, CD Crohn’s disease, UC ulcerative colitis, ST solid tumor, HC hematologic cancers

Severe COVID-19 Outcomes

With age and sex standardization to the 2010 US Census data, the risk of severe COVID-19 among the general cohort of patients with COVID-19 was 9.7%, with a hospitalization risk of 9.2% and mortality risk of 1.7%. By condition of interest, the age- and sex-standardized risk of severe COVID-19 outcomes varied: patients with HC (21.5%), SLE (18.1%), ST (13.1%), and RA (12.1%) had a higher risk than the general COVID-19 cohort; patients with PsA (7.4%), AS (7.2%), and AD (7.0%) had a lower risk than the general cohort; patients with UC (9.5%), CD (9.5%), and PsO (9.1%) had a comparable risk to the general cohort (Fig. 6).
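Direct standardization, as used for the figures above, weights each stratum-specific risk by that stratum's share of a standard population. The Python sketch below shows the calculation with made-up strata and numbers; the function name is hypothetical, and the example values are not taken from the study or from the 2010 US Census.

```python
from typing import Dict


def directly_standardized_risk(
    stratum_risks: Dict[str, float],
    standard_population: Dict[str, float],
) -> float:
    """Weighted average of stratum risks, with standard-population counts as weights."""
    total = sum(standard_population.values())
    return sum(
        stratum_risks[s] * n for s, n in standard_population.items()
    ) / total


# Hypothetical illustration only: three age strata and invented values.
example_risks = {"<45": 0.02, "45-64": 0.10, "65+": 0.30}
example_standard = {"<45": 60, "45-64": 26, "65+": 14}
example_result = directly_standardized_risk(example_risks, example_standard)
```

In this invented example the standardized risk is 0.08, because the younger, lower-risk stratum carries most of the weight. The same mechanism explains why standardized risks can be lower than crude risks in cohorts that are older than the standard population.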

Fig. 6 Age- and sex-standardized risk of severe COVID-19 outcomes among patients diagnosed with COVID-19, by condition of interest. All general population of patients with COVID-19, RA rheumatoid arthritis, PsA psoriatic arthritis, AS ankylosing spondylitis, SLE systemic lupus erythematosus, PsO psoriasis, AD atopic dermatitis, CD Crohn’s disease, UC ulcerative colitis, ST solid tumors, HC hematologic cancers

We compared baseline characteristics between patients with each condition of interest and the general COVID-19 cohort. Outcomes were adjusted for demographic factors (i.e., age, sex, race, ethnicity, geographic region) and potential COVID-19 risk factors, including CCI score, BMI, diabetes, CVD, and chronic lung disease. After adjustments, age was still not comparable for patients with RA, SLE, PsA, and HC when compared to the general COVID-19 cohort (Fig. 7). Additionally, BMI and CCI score often showed moderate differences in their distributions between patients with a condition of interest and the general COVID-19 cohort.
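The standardized mean difference used to judge balance can be computed from summary statistics alone. The Python sketch below uses the common pooled-standard-deviation definition; whether the software used in the study applies exactly this form is an assumption, and the function names are hypothetical.

```python
import math


def smd_continuous(mean_a: float, sd_a: float, mean_b: float, sd_b: float) -> float:
    """Absolute difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return abs(mean_a - mean_b) / pooled_sd


def smd_binary(p_a: float, p_b: float) -> float:
    """Same idea for a proportion, using the binomial variance p * (1 - p)."""
    pooled_var = (p_a * (1 - p_a) + p_b * (1 - p_b)) / 2
    if pooled_var == 0:
        return 0.0  # both groups are all-0 or all-1; no imbalance to measure
    return abs(p_a - p_b) / math.sqrt(pooled_var)


def is_balanced(smd: float, threshold: float = 0.2) -> bool:
    """The text treats an SMD below 0.2 as an inconsequential imbalance."""
    return smd < threshold
```

For instance, two groups with mean ages of 50 and 40 years and a standard deviation of 10 years each have an SMD of 1.0, far above the 0.2 threshold.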

Fig. 7 Standardized mean difference between disease condition and all patients with COVID-19 before and after adjustment. CCI Charlson comorbidity index, BMI body mass index. The adjusted covariates were age, gender, race, ethnicity, BMI, region of residence, CCI, diabetes, chronic lung disease, and cardiovascular disease. The dashed line represents SMD = 0.2

The risk of severe COVID-19 outcomes varied by condition of interest. Compared to the general COVID-19 cohort, patients with HC (aRR 2.0, 1.8–2.1), RA (aRR 1.2, 1.1–1.3), and ST (aRR 1.1, 1.1–1.1) were estimated to be at a higher risk of severe COVID-19. In contrast, the patients with PsA (aRR 0.8, 0.6–1.0) or AD (aRR 0.8, 0.7–0.9) had a lower risk of severe COVID-19 compared to the general population with COVID-19. Patients with SLE (aRR 1.1, 0.9–1.2), PsO (aRR 1.0, 0.7–1.2), UC (aRR 0.9, 0.8–1.1), CD (aRR 0.9, 0.7–1.0), and AS (aRR 0.8, 0.5–1.0) showed a comparable risk of severe COVID-19 (Fig. 8 and Supplementary Table S1). Separate analyses were conducted for hospitalization and death. With the exception of CD and AS, the risks of hospitalization and death were similar to the risk of severe COVID-19 (Supplementary Fig. S1 and Table S1).

Fig. 8 Risk ratios (and 95% CIs) for severe COVID-19 outcomes by condition of interest, relative to all patients with COVID-19 before and after adjustment. Risk ratios were adjusted for age, gender, race, ethnicity, BMI, region of residence, CCI, diabetes, chronic lung disease, and cardiovascular disease

Discussion

This study utilized a nationwide EHR database in the USA to examine patient demographics, clinical characteristics, and severe COVID-19 in a large cohort of patients with COVID-19 between February 2020 and March 2021. This study examined major IMIDs and malignancies of interest separately and then compared them to the general population of patients with COVID-19 within a large cohort in the USA.

We found that patients with RA, ST, or HC were at significantly increased risk for COVID-19-related hospitalization, death, and severe COVID-19 compared to the general population of patients with COVID-19 after statistical adjustments. These findings are consistent with previous studies among patients with RA [25,26,27] and patients with malignancies [11, 28]. However, a large cohort study using the US TriNetX data found that the risk of severe COVID-19, including death and hospitalization, was not significantly different between patients with and without RA [29]. Patients with HC were at a higher risk than those with ST for severe COVID-19 outcomes. In a retrospective case–control study, Wang et al. reported that patients with malignancies and COVID-19 had significantly worse outcomes, especially patients with HC, such as leukemia or non-Hodgkin’s lymphoma [11].

We also found that patients with UC or CD were not at higher risk for severe COVID-19 than the general population with COVID-19. These findings are similar to previous reports for patients with inflammatory bowel disease (IBD) in Italy and Sweden [30, 31]. Moreover, a multicenter research network study found that patients with IBD and COVID-19 were not at an increased risk for hospitalization or COVID-19-related death compared to patients without IBD (relative risk 0.93; 95% CI 0.68–1.27) [32]. We observed that patients with AD showed a lower risk of being hospitalized or developing severe COVID-19, even after adjusting for age. Current evidence indicates that patients with AD are not at increased risk of hospitalization or death [33, 34]. Keswani et al. found that AD was inversely associated with COVID-19 hospitalization [35]. These results, including the widely different risk ratios for individual conditions, indicate that the risk of severe COVID-19 may vary across different IMIDs. Our study showed that the risk for COVID-19 hospitalization in patients with PsA was lower than that in patients with other IMIDs (e.g., RA, SLE, UC or CD). Similarly, Curtis et al. found that patients with RA or UC had a higher incidence of hospitalization than patients with PsA [36].

After IPW adjustment, more covariates had standardized differences below 0.2, which shows that the adjustment reduced the differences in patient demographics and baseline characteristics between the patients with conditions of interest and the general cohort of patients with COVID-19. However, considerable differences in baseline demographics and patient characteristics still existed. This might reflect the nature of the patient population rather than true bias. For example, the AD population with COVID-19 was noticeably younger than the general COVID-19 population, due to the early age of onset for AD [37]. SLE is more common in young women [38,39,40], which was reflected in the large imbalance in gender distribution in our lupus population with COVID-19. Malignancy generally affects senior populations more than younger ones, and therefore we were not surprised to see the higher mean age among patients with ST and HC who contracted COVID-19. When assessing patient risk, a comprehensive patient profile, including patients’ demographics and other underlying medical conditions, must be considered to make sure that patient outcomes are comparable.

Consistent with existing literature on risk factors associated with severe COVID-19, older age [41, 42], underlying CVD [39, 42], diabetes [42, 43], obesity [39, 44], and high CCI score [45] were more common among patients with conditions of interest who experienced severe COVID-19 than among those who did not. Differences in COVID-19 diagnosis index periods between those with and without severe COVID-19 might reflect the improvement in patient management during the latter period of the pandemic, and/or might be associated with the availability of COVID-19 vaccines and diagnostic tests. Vaccinated patients with COVID-19 are less likely to require hospitalization and have disease progression to mechanical ventilation or death compared with unvaccinated patients [46, 47].

A few limitations of this study should be noted as well. First, hospitalizations and care external to the healthcare delivery network contributing to the EHR data source may not be captured, and centers contributing to the Optum EHR data may not be fully representative of the general population in the USA. We do not believe this limitation would affect our conditions of interest differently; therefore, while it might affect the absolute risk assessment for the study outcomes, its impact on the relative risk assessment for severe COVID-19 between conditions of interest and the general population was likely minimal. Second, COVID-19 vaccination status was not evaluated in this study. Vaccination might have influenced the number of patients included in the cohort and their outcomes from mid-December 2020 onward, when the US Food and Drug Administration (FDA) granted emergency use authorization to COVID-19 vaccines in the USA, but the influence might not be substantial given the extremely limited vaccine access the general population had during the study period [48]. For reference, by 3 March 2021 (i.e., the last cohort entry date in this study), only 9% (n = 32.5 million) of the US population was fully vaccinated [49, 50]. Because COVID-19 vaccines were not broadly available to the general US population during the study period, COVID-19 vaccine data were extremely under-captured and potentially unreliable in this EHR data source. Many patients received their first COVID-19 vaccine doses at community pharmacies or mass vaccination sites outside of hospitals or clinics, which required patients to self-report vaccine status to their medical care providers. Additionally, due to limitations in data availability, including 5.7% of the study population having less than 3 months of follow-up, death was only assessed during the 3 months after the index date, which may lead to an underestimation.
Serious long-term consequences of SARS-CoV-2 infection are still not fully understood [51]. A better understanding of the long-term consequences and implications of COVID-19 in patients with IMIDs or malignancies is urgently needed to guide clinicians in the care of these patients. Lastly, this study was subject to residual confounding from factors such as variants of SARS-CoV-2 and disease severity, which were not available in the data source. Another source of potential confounding was immunosuppressive therapy, which may increase the risk and severity of COVID-19 for patients with a condition of interest. The extent of an individual’s immunocompromised state may differ by both disease severity and activity. This could be further complicated by treatment regimens, as many therapeutic options for IMIDs and malignancies contribute to immunosuppression. However, a few studies reported no association between immunosuppressive treatment (e.g., disease-modifying antirheumatic drugs) and severe COVID-19 outcomes [52,53,54].

One strength of this study is the level of granularity it adds to the current literature on evaluating the risk of severe COVID-19 among patients with major IMIDs by individual condition within a large nationwide cohort. In the current literature, often only a few of these conditions of interest are assessed, and specific conditions are often not compared against each other or are pooled for comparison against a general population, since the collective prevalence of IMIDs is estimated at between 5% and 10% in industrialized populations [29, 53, 55, 56]. The results of this study suggest the risk of severe COVID-19 varies by condition within IMIDs. In addition, the use of large-scale, nationwide COVID-19 data in this study allowed for a more comprehensive assessment of patients diagnosed with COVID-19. Beyond the available COVID-19 diagnosis codes, diagnostic test results were professionally reviewed by the data provider for their reliability to identify positive results. This may have captured additional patients with milder infections who may not have utilized further COVID-19-related medical services. EHR as a data source is payer agnostic, and it is less influenced by changes in eligible patients arising from the societal consequences of the COVID-19 pandemic, such as loss of employer-based health insurance and transition to government-sponsored coverage (e.g., Medicaid, Medicare) or to individual market coverage [57].

Conclusion

Using a nationwide EHR database, this study evaluated whether patients with underlying IMIDs or malignancies were more vulnerable to severe COVID-19 outcomes because of the immunosuppressive nature of such conditions. The risk of severe COVID-19 varied by condition. Compared to the general population of all patients diagnosed with COVID-19, patients with RA, ST, and HC were at a higher risk of severe COVID-19 outcomes (i.e., hospitalization or death). These findings highlight the need to protect and monitor patients with specific immunocompromising diseases as part of the strategy to mitigate risks for severe COVID-19 in this ongoing pandemic.