Introduction

The SARS-CoV-2 pandemic represents an unprecedented global challenge. By November 2021, over 247 million confirmed cases of SARS-CoV-2 infection had been reported and more than 5 million patients had died in association with the coronavirus disease 2019 (COVID-19) [1]. Even with infection rates and numbers of patients hospitalized for COVID-19 decreasing in some countries, the long-term consequences of SARS-CoV-2 infections, often referred to as long COVID syndrome (LCS), represent a growing medical and socioeconomic problem worldwide [2].

LCS can affect a wide range of organ systems such as the respiratory system or the nervous system [3]. Commonly observed symptoms include shortness of breath, fatigue, anosmia, muscle weakness or cognitive impairment [3]. However, a broad variety of at least partly unspecific symptoms has been described in the context of LCS. To date, LCS has been only poorly defined, and systematic data on incidence rates are largely missing [4, 5]. Current data from the United Kingdom and the United States of America indicate that the incidence of LCS ranges between 7 and 13.3%, depending on the definition of LCS as well as the length of the follow-up period after initial diagnosis of COVID-19 [4, 5]. The WHO has recently published a clinical case definition of post-COVID-19 syndrome, which also includes a review of several other definitions of LCS/post-COVID syndrome [6].

Risk factors for LCS are largely unclear. In particular, it is only poorly understood whether the incidence and severity of LCS correlate with the severity of acute SARS-CoV-2 infection, warranting a clear definition of risk factors for the development of LCS [2]. In addition, most existing data on LCS focus primarily on patients hospitalized for COVID-19, while less severe courses that are treated by general practitioners only are less frequently considered. In the present study, we therefore used the Disease Analyzer database (IQVIA), which features diagnoses and basic medical as well as demographic data of outpatients treated in general practices in Germany, to study the prevalence of LCS in Germany and to identify clinical factors associated with its development.

Materials and methods

Study design and database

This retrospective observational study was based on cross-sectional medical record data from the Disease Analyzer database (IQVIA), which compiles diagnoses as well as general medical and demographic data that are anonymously obtained from computer systems of general practitioners and specialists in Germany [7]. The sampling method for the Disease Analyzer database is based on summary statistics from all medical doctors in Germany that are published yearly by the German Medical Association and is defined according to the specialist group, the German federal state, the community size category, and the physicians’ age. The database covers approximately 3% of all outpatient practices in Germany. The sampling methods used to select physicians’ practices have been shown to be appropriate for obtaining a population-representative database of primary and specialized care in Germany [7]. Diagnoses (according to the International Classification of Diseases, 10th revision [ICD-10]), prescriptions (according to the Anatomical Therapeutic Chemical [ATC] Classification system), and the quality of reported data are constantly monitored by IQVIA.

Study population and outcomes

The analysis included 50,402 patients with a confirmed diagnosis of COVID-19 (ICD-10: U07.1) between March 1, 2020 and March 31, 2021 (index date) from one of 1056 GP practices that routinely send data to the Disease Analyzer database. The study’s primary outcome was the proportion of patients with a documentation of long COVID syndrome (LCS) or a diagnosis suggestive of LCS. Since there was no specific ICD-10 code for LCS during this period, LCS was identified based on the original diagnosis text of the physicians (“long COVID syndrome”, “post COVID syndrome”, “post COVID complications”). The following ICD-10 diagnoses were additionally used as surrogates for LCS: chronic fatigue (ICD-10: G93.3), abnormalities of breathing (ICD-10: R06), disturbances of smell and taste (ICD-10: R43), malaise and fatigue (ICD-10: R53), and disturbances in attention (ICD-10: R41.8). Patients with one or more of these diagnoses documented between 90 and 183 days after the diagnosis of COVID-19 were enrolled. Patients with one or more of these diagnoses documented within 12 months prior to the diagnosis of COVID-19 were excluded.
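The outcome definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the study: the function names and the record layout (a list of date and ICD-10 code pairs per patient) are hypothetical, the free-text diagnoses are ignored, "12 months" is approximated as 365 days, and both ends of the 90 to 183 day window are treated as inclusive.

```python
from datetime import date, timedelta

# ICD-10 code prefixes listed in the text as surrogates for LCS.
SURROGATE_PREFIXES = ("G93.3", "R06", "R43", "R53", "R41.8")


def is_surrogate(code: str) -> bool:
    """True if an ICD-10 code falls under one of the surrogate diagnoses."""
    return code.startswith(SURROGATE_PREFIXES)


def classify(index_date: date, diagnoses: list[tuple[date, str]]) -> str:
    """Classify one patient as 'excluded', 'lcs', or 'no_lcs'.

    diagnoses is a hypothetical list of (diagnosis date, ICD-10 code) pairs.
    """
    surrogate_dates = [d for d, code in diagnoses if is_surrogate(code)]

    # Exclusion: a surrogate diagnosis in the 12 months before the index date
    # (assumption: 12 months taken as 365 days).
    lookback_start = index_date - timedelta(days=365)
    if any(lookback_start <= d < index_date for d in surrogate_dates):
        return "excluded"

    # Outcome: a surrogate diagnosis 90 to 183 days after the index date
    # (assumption: both boundaries inclusive).
    if any(90 <= (d - index_date).days <= 183 for d in surrogate_dates):
        return "lcs"

    return "no_lcs"
```

Applying the exclusion before the outcome check means that a patient with a pre-existing surrogate diagnosis is never counted as an LCS case, which mirrors the order in which the two criteria are described.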

Statistical analyses

The proportion of patients with LCS was analyzed for the total study population as well as for men, women and four age groups (≤ 30, 31–45, 46–60 and > 60 years). The association between predefined variables and the incidence of LCS was investigated in a multivariable logistic regression model. This model included age, sex, and the following diagnoses documented within 12 months prior to the index date: arterial hypertension (ICD-10: I10), lipid metabolism disorders (ICD-10: E78), obesity (ICD-10: E66), cancer (ICD-10: C00–C99), type 1 diabetes mellitus (ICD-10: E10), type 2 diabetes mellitus (ICD-10: E11, E14), depression (ICD-10: F32, F33), asthma (ICD-10: J45), and chronic obstructive bronchitis or lung disease (ICD-10: J42–J44). In a subgroup of patients with available body mass index (BMI) values documented within 6 months prior to the index date (n = 7732), the association between BMI and LCS was analyzed in a second multivariable logistic regression model. Results from the logistic regression analyses are shown as odds ratios (ORs) with 95% confidence intervals (CIs). A p value lower than 0.05 was considered statistically significant. All analyses were performed using SAS 9.4 (SAS Institute Inc., Cary, NC).
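For readers less familiar with how odds ratios are derived from a logistic regression, the following Python sketch shows the usual conversion of a fitted coefficient and its standard error into an OR, a 95% CI, and a two-sided p value. It assumes Wald-type intervals and tests; the text does not state which interval type was used, the function names are hypothetical, and the example input is illustrative rather than taken from the study.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(beta: float, se: float, z: float = Z_95):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    beta is the log-odds coefficient and se its standard error;
    the interval is a Wald interval on the log-odds scale, exponentiated.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def wald_p_value(beta: float, se: float) -> float:
    """Two-sided p value of the Wald test for beta = 0."""
    z = abs(beta / se)
    # 2 * (1 - Phi(z)) expressed through the complementary error function.
    return math.erfc(z / math.sqrt(2))


# Illustrative input only (not a coefficient from the study).
or_, lower, upper = odds_ratio_with_ci(beta=math.log(1.5), se=0.1)
p = wald_p_value(beta=math.log(1.5), se=0.1)
print(f"OR {or_:.2f}, 95% CI {lower:.2f}-{upper:.2f}, p = {p:.4f}")
```

Under this convention, a 95% CI that excludes 1 corresponds to p < 0.05, which is the significance threshold stated above.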

Results

Characteristics of study cohort

To identify risk factors for the development of long COVID syndrome (LCS), we performed a retrospective observational study based on cross-sectional medical record data from the Disease Analyzer database (IQVIA), which compiles diagnoses as well as general medical and demographic data obtained anonymously from computer systems of general practitioners in Germany [7]. Of the 50,402 patients with a confirmed SARS-CoV-2 infection (ICD-10: U07.1), 1708 (3.4%) were diagnosed with LCS or one of the related diagnoses (ICD-10: G93.3, R06, R43, R53; Table 1). The average time between the diagnosis of COVID-19 and the diagnosis of LCS was 82 days (SD 28 days). Each patient had at least one diagnosis of LCS or the related diagnoses > 90 days after the initial diagnosis of COVID-19. The mean age of all COVID-19 patients was 48.8 years (SD: 19.3 years), and 27,512 (54.5%) of the patients were female. Arterial hypertension (n = 12,898, 25.6%) was the most prevalent comorbidity, followed by lipid metabolism disorders (n = 8580, 17.0%), depression (n = 8529, 16.9%), diabetes mellitus type 2 (n = 5060, 10.0%), obesity (n = 4995, 9.9%), and chronic bronchitis or chronic obstructive pulmonary disease (n = 4399, 8.7%).

Table 1 Baseline characteristics of the study sample

Clinical factors associated with the development of long COVID syndrome

To identify independent risk factors for LCS, we performed multivariable logistic regression analyses (Table 2). These analyses revealed that lipid metabolism disorders (OR 1.46, 95% CI 1.28–1.65, p < 0.001) and obesity (OR 1.25, 95% CI 1.08–1.44, p = 0.003) displayed a strong association with the development of LCS. Notably, the age group between 46 and 60 years (OR 1.81, 95% CI 1.54–2.13, p < 0.001) was associated with a 1.8-fold higher risk of LCS compared to patients ≤ 30 years. Moreover, the risk for LCS rose gradually with increasing BMI and was highest among patients with a BMI ≥ 35 kg/m²; however, this association was not significant due to the small number of patients with documented BMI values. Besides these metabolic factors, we identified that female sex (OR 1.33, 95% CI 1.20–1.47, p < 0.001) was significantly associated with the likelihood of being diagnosed with LCS. In addition, pre-existing asthma (OR 1.49, 95% CI 1.28–1.73, p < 0.001), hypertension (OR 1.31, 95% CI 1.15–1.48, p < 0.001), and depression (OR 1.21, 95% CI 1.07–1.37, p = 0.002) emerged as risk factors for the development of LCS. In contrast, pre-existing diabetes mellitus type 1 or 2, ischemic heart disease, or cancer were not significantly associated with the development of LCS (Table 2). Finally, we observed differences regarding the development of LCS between female and male COVID-19 patients. Obesity had a stronger effect in women than in men, and a pre-existing cancer diagnosis had a significant effect on the development of LCS in men but not in women. In contrast, asthma and depression were significantly associated with LCS in female but not male COVID-19 patients (Table 2).

Table 2 Association between predefined variables and the incidence of long COVID syndrome in patients diagnosed with COVID-19 (multivariable logistic regression model)

Discussion

Our data suggest that lipid metabolism disorders and obesity, but not diabetes, represent strong age-independent risk factors for LCS. As the pathophysiology of LCS is presently unclear, this finding provides important information about a possible pathophysiological relationship between metabolic risks and the development and severity of LCS. This would support the hypothesis that obesity-related chronic inflammation and immune-metabolic processes promote not only severe clinical courses of acute SARS-CoV-2 infection [8], but also the development of LCS. In this context, it cannot be excluded that our statistical analysis reflects an indirect association between severe courses of COVID-19 and the occurrence of LCS. However, it should be noted that the data source of outpatients with SARS-CoV-2 infection makes it unlikely that severe clinical courses had accumulated in our cohort of LCS patients. Moreover, diabetes or age > 60 years, known risk factors for severe courses of acute COVID-19 [9, 10], were not associated with LCS in our cohort, arguing against a linear concordance between the risk profiles of acute COVID-19 and LCS.

Post-acute sequelae (PAS) in the context of viral respiratory infections do not represent a fundamentally new observation, since PAS were already described as a consequence of other non-persistent viral infections in the pre-COVID era [11]. Of note, recent data suggest that the clinical symptoms that are now referred to as LCS likewise occur after infection with seasonal influenza [12]. Interestingly, metabolic factors are also discussed as potential risk factors for short- and long-term mortality and morbidity in other viral infections [13, 14, 15], highlighting the general role of metabolic diseases as determinants of patients’ long-term outcome after viral infections. Besides metabolic risk factors, we identified other pre-existing medical conditions such as asthma, arterial hypertension and depression as important risk factors for the development of LCS. Our observation that female sex and patients’ age between 46 and 60 years indicate an increased risk of LCS is consistent with other published data from non-hospitalized [16] or hospitalized cohorts [17] of COVID-19 patients. Sigfrid et al. showed that women under age 50 were up to five times less likely to report feeling recovered and twice as likely to report worse fatigue than men of the same age [17]. A recent study of a cohort of healthcare workers (HCWs) made observations that point in a similar direction to our data from a larger and more representative population, showing an OR of 1.6 for HCWs who were overweight and an OR of 3.7 for HCWs who had lung disease [18].

In contrast to previous studies of LCS, which have focused predominantly on specific patient groups and tended to study cohorts cared for at specialized COVID-19 centers, our study features a large cohort of COVID-19 outpatients that is representative of the sociodemographic situation in Germany and other high-income countries. However, we acknowledge some limitations. First, during the study period, LCS represented a novel diagnosis that evolved over time and had not yet been assigned to a specific ICD code. Clear diagnostic criteria as now provided by the WHO were lacking, which may have led to overestimation or underestimation of LCS cases. Second, besides diagnosis of LCS in the original diagnosis text, we included diagnoses suggestive of LCS (e.g., “abnormalities of breathing”) that could also occur independently of COVID-19, and there is no valid information on whether these symptoms were associated with COVID-19 or not. Conversely, some diagnoses that are also consistent with LCS may not have been sufficiently accounted for. Moreover, data starting from March 2020 were used, while LCS only came to light at the end of 2020, which explains why this diagnosis was documented more often in the last months of the study period. Finally, we were unable to include a control group, as the diagnosis of LCS cannot occur in people who were not previously diagnosed with COVID-19. Nevertheless, our database of currently more than 50,000 COVID-19 patients is a valuable source to identify risk factors for the development of LCS. The overlap with previously published results [16, 17] strengthens the validity of our results and supports the usability of our database in the context of LCS research.

In summary, since obesity and lipid disorders represent modifiable risk factors, our data suggest that lifestyle and metabolic interventions could be part of future strategies for pandemic preparedness. Moreover, our data support the view that patients with metabolic diseases should be considered as risk patients in all phases of COVID-19 and therefore need close clinical supervision even after overcoming the acute phase of COVID-19.