Excess risk and clusters of symptoms after COVID-19 in a large Norwegian cohort

Physical, psychological and cognitive symptoms have been reported as post-acute sequelae for COVID-19 patients but are also common in the general uninfected population. We aimed to calculate the excess risk and identify patterns of 22 symptoms up to 12 months after COVID-19. We followed more than 70,000 adult participants in an ongoing cohort study, the Norwegian Mother, Father and Child Cohort Study (MoBa) during the COVID-19 pandemic. Infected and non-infected participants registered presence of 22 different symptoms in March 2021. One year after infection, 13 of 22 symptoms were associated with SARS-CoV-2 infection, based on relative risks between infected and uninfected subjects. For instance, 17.4% of SARS-CoV-2 infected cohort participants reported fatigue that persist 12 months after infection, compared to new occurrence of fatigue that had lasted less than 12 months in 3.8% of non-infected subjects (excess risk 13.6%). The adjusted relative risk for fatigue was 4.8 (95% CI 3.5–6.7). Two main underlying factors explained 50% of the variance in the 13 symptoms. Brain fog, poor memory, dizziness, heart palpitations, and fatigue had high loadings on the first factor, while shortness-of breath and cough had high loadings on the second factor. Lack of taste and smell showed low to moderate correlation to other symptoms. Anxiety, depression and mood swings were not strongly related to COVID-19. Our results suggest that there are clusters of symptoms after COVID-19 due to different mechanisms and question whether it is meaningful to describe long COVID as one syndrome. Supplementary Information The online version contains supplementary material available at 10.1007/s10654-022-00847-8.


Introduction
In light of the many million people infected by SARS-CoV-2, it is important to understand the long-term physical, psychological and cognitive consequences for infected subjects in a population perspective. How common are the symptoms that persist or occur after infection, how long will they last, and what do they consist of? Since reported symptoms are mostly of a general nature, apart from altered smell and taste, one must take account of the incidence of these complaints in the uninfected population. Recent reviews [1][2][3] of post-acute COVID-19 syndrome or long COVID mostly refer to follow-up studies of patients treated in the specialised health services. These studies are important for detailed understanding of the multi-organ sequelae of COVID-19, but do not represent all infected subjects, do not take account of the incidence of symptoms in the general, uninfected population, and do not subtract symptoms that were present before infection. The reviews show that the design, sampling and outcome measures in follow-up studies are heterogeneous, making meta-analyses difficult [1][2][3]. There is a need for population-based, large cohort studies with long-term follow-up that registers new symptoms both in infected and non-infected subjects.
Our study includes adult cohort participants, most of them aged 35-65 years, from the Norwegian Mother, Father and Child Cohort Study (MoBa) [4]. We compare the proportion of new symptoms for participants with and without COVID-19, calculating excess risks for each symptom. We also show the number of symptoms reported by each participant and examine the symptom profile for participants infected 1-6 or 11-12 months ago. We compare risks for men and women, and for subjects with and without severe disease.
There is no clear consensus of what constitutes the postacute COVID-19 syndrome. Since there are symptoms from several organ systems, such as the nervous and respiratory systems, it is of interest to aggregate the long list of symptoms into clusters that can explain as much variation as possible. We aimed to reduce the complexity of symptoms using an exploratory factor analysis.

Study design
The Norwegian Mother, Father and Child Cohort Study (MoBa) is a population-based pregnancy cohort study conducted by the Norwegian Institute of Public Health. Participants were recruited from all over Norway from 1999 to 2008 [4]. The women consented to participation in 41% of the pregnancies and the cohort now includes 95,000 mothers and 75,000 fathers. Parents and children have been followed with questionnaires and registry linkages with the aim to understand causes of disease. MoBa uses data from The Medical Birth Registry (MBRN), which is a national health registry containing information about all births in Norway. Using each participant's unique national identification number, MoBa was also linked to the National Population Registry, the National Surveillance System for Communicable Diseases (MSIS) [5] and to the Norwegian Immunisation Registry (SYSVAK) [6]. It is mandatory to report cases of COVID-19 to MSIS and vaccinations against SARS-CoV-2 to SYSVAK.

Study population
Since March 2020, about 150,000 adult active cohort participants have been invited to answer electronic questionnaires every 14 days with questions related to COVID-19. The response rates to the questionnaires distributed between March 2020 and March 2021 have been 50-80%. For the current study, eligible participants included all adult cohort members who were invited to answer a questionnaire in March 2021 (n = 139,326) about current symptoms and, if present, the duration of such symptoms. The questions were posed to all cohort participants, regardless of previous COVID-19. We excluded participants who received a COVID-19 diagnosis in February or March 2021 (n = 486), to ensure that all COVID-19 cases in our study sample had been diagnosed at least 4 weeks earlier. Also, vaccination against SARS-CoV-2 was initiated in Norway in late December 2020 and very few COVID-19 cases had been vaccinated during our study period which ended January 31st, 2021. We therefore excluded participants who were registered with any COVID-19 vaccine dose in SYSVAK if the vaccination date preceded or corresponded to the date when symptoms were reported in March 2021 (n = 6844) (Fig. 1).

Exposures
A COVID-19 diagnosis was obtained from registry data (MSIS) based on PCR confirmed SARS-CoV-2 infection. We considered participants registered with a diagnosis before February 1st, 2021 as COVID-19 cases. Multiple registrations were not identified for any of the participants included in the study sample. Based on the two main waves of increased infection rates in Norway, we performed stratified analyses with (i) a COVID-19 diagnosis in spring (Wave-1), including participants who were registered with a diagnosis in the period before May 1st, 2020 (first registration was March 6th), and (ii) a COVID-19 diagnosis in autumn/winter (Wave-2), including those with a diagnosis registered between September 1st, 2020 and January 31st, 2021. The control groups were defined by those who had not received a COVID-19 diagnosis at all during the study period.

Outcomes
A questionnaire was distributed to the participants' mobile phones on March 2nd, 2021. The questionnaire included a list of 22 specific symptoms/signs or diseases (Supplementary Table 1). The list included conditions and ailments that could be grouped as cardiorespiratory, neurocognitive, joint and muscle, or "other" (such as altered smell or taste, and fever). For simplicity, we refer to all listed symptoms/ signs/diseases as "symptoms". All participants were asked  to "check off if you have any of the following conditions/ ailments now". The symptoms were selected based on a list from Centers for Disease Control and Prevention (CDC) in the US [7]. Participants reporting any symptom were also asked about the duration (< 1 month, 1-3 months, 4-6 months, 7-12 months, 13-18 months, > 18 months). Among COVID-19 cases, we considered symptoms to be novel if the reported duration was shorter than the time since acquiring a COVID-19 diagnosis.

Covariates
Vaccination status was obtained from the Norwegian Immunisation Registry (SYSVAK) [12]. From the existing MoBa database, we included variables on the participants' age (continuous, calculated from birth year), gender (defined by cohort member role as mother or father), and as proxy of socioeconomic status, educational level (less than high school, high school, college ≤ 4 years, more than 4 years college). Underlying chronic illness were self-reported in three questionnaires to cohort participants distributed in March and April 2020. The following diseases were reported: asthma or other lung disease, cancer, heart disease, hypertension, diabetes, other disease, or no disease. We grouped participants together if they reported at least one disease (including "other disease") in any of the three questionnaires. From the ongoing data collection, we also included information from January 2021 on current smoking (no/yes; and if yes: occasional/daily) and body mass index (BMI, calculated from height and weight).

Statistical analysis
We estimated associations between COVID-19 status and symptoms reported in March 2021 using log-binomial regression models with heteroscedasticity consistent (robust) standard errors. Associations were reported as excess risk (risk differences, RD) and relative risks (RR). RRs were reported with 95% confidence intervals (CI). We examined associations with unadjusted and adjusted regression models including age and chronic disease. We also examined associations after adjustment for education, BMI and smoking in addition to age and chronic illness. We performed analyses stratified by timing of infection (Wave 1 or 2) while excluding case subjects reporting symptom duration exceeding the time since infection. Accordingly, controls were also included based on recency of symptoms. We excluded controls if their symptoms had lasted more than 12 months when compared with cases in Wave-1 or more than 6 months when compared with cases in Wave-2. We also performed analyses stratified by self-reported severity of symptoms and gender, using any COVID-19 diagnosis before February 1st 2021 as exposure.
We examined bivariate correlations between symptoms using tetrachoric correlations, assuming an underlying bivariate normal distribution of the symptoms. Significance of correlations were indicated by a correlation test inferring Pearson correlation with α = 0.05. Symptom patterns among all COVID-19 cases were derived from exploratory factor analysis using the tetrachoric correlation matrix. To decide the number of factors, we used Horn's Parallel Analysis for factor retention ( Supplementary Fig. 1). We examined both the two-factor and three-factor solutions, Supplementary Table 2. We also explored both direct oblimin (oblique, allowing correlated factors) and varimax (orthogonal) rotations. The two underlying factors identified were similar in the two rotation approaches (Table 5 and Supplementary  Table 3). We omitted rare occurrences (myocarditis, kidney disease) from the factor analysis, as well as symptoms with no increased risk (based on RR) among COVID-19 cases 11-12 months after infection (sleep problems, depression, mood swings, joint pain, muscle pain, hair loss, and fever). Statistical analyses were done in R, version 4.1.0, using packages sandwich, nFactor, polychor, cor.mtest, corrplot, psych.

Missing data
The proportion of missing values in covariates was 11.5% for chronic illness and 3.7% for education level. We performed multiple imputation by chained equations with 20 imputations, using the R package mice. The dataset used for imputation included the following variables: chronic illness, education, age, BMI, smoking, gender, Covid-19 diagnosis, and long-term symptoms. Imputed values did not differ substantially from observed values. For instance, the proportion of chronic illness was 29.3% for imputed values and 29.4% in observed data. For education, the proportions in the four categories were as follows (imputed vs. observed): < High school 6.3 vs. 6

Results
In total, 774 (1.0%) of the 73,727 included cohort participants were infected with SARS-CoV-2 in the study period. Figure 2 shows the number of infected MoBa participants per month. Of these, 170 were infected in March or April 2020 (defined as Wave-1 subjects), and 583 participants infected from September 2020 to January 2021 (Wave-2 subjects). All infected and non-infected participants constitute the study population (Table 1).

Excess and relative risks after 11-12 months (Wave-1 subjects)
After 11-12 months, infected subjects had increased risk of 13 of the 22 symptoms when compared to uninfected subjects, with adjusted RRs ranging from 1.8 to 51.4 ( Table 2). The symptom with highest excess risk (16.6%) 11-12 months after infection was altered smell or taste. The excess risk for this symptom is not much different from the absolute risk, since only 0.3% of the uninfected subjects reported this as a new symptom. This translates to a large adjusted relative risk (51.4, 95% CI: 36.0-73.5). Other symptoms with relatively high excess risks are poor memory (14.6%) and fatigue (13.6%). Psychological symptoms, such as anxiety, depression and mood swings have low excess risk, which is also the case for fever, muscle and joint pain. Reduced lung function (7.4%) and shortness-of breath (9.9%) have higher excess risk, and the adjusted relative risk of experiencing reduced lung function during the past year is as high as 24.9 (95% CI: 14.6-42.7), since very few (0.3%) uninfected subjects report this as a new symptom. The adjusted relative risk for chest pain was lower in complete case analysis (RR 4.2, 95% CI: 1.5-8.9) than in the imputed data analysis (RR 6.7, 95% CI: 3.6-12.7). Other symptoms showed no large changes in complete cases analysis. Relative risks showed no large changes after adjustment for education, BMI, and smoking in addition to age and prior chronic disease (Supplementary Table 4).

Excess and relative risks after 1-6 months (Wave-2 subjects)
The picture is much the same after 1-6 months ( Table 3). The excess risk for altered smell or taste is 21.8%, while it is 11.7% for poor memory, 17.4% for fatigue and 14.2% for shortness-of-breath. Headache (excess risk 8.9%), dizziness (8.0%), muscle or joint pain (both 4.9%) appear to be relatively more common after 1-6 months compared to 11-12 months. Anxiety and depression have low excess risk also after 1-6 months.

Risks according to mild or severe infection
The participants were asked whether they had been almost not ill, moderately ill, very ill or hospitalized during the initial infection with SARS-CoV-2. In Table 4 we include all subjects with COVID-19 (wave 1 and 2 and the few subjects infected between the waves) and compare subjects who reported almost no illness (mild illness) with subjects reporting more severe illness (moderately, very ill, or hospitalized). In general, the prevalence of symptoms is about twice as high for subjects with severe infection. For brain fog, the prevalence is 18.1% for severely ill and 9.2% for mildly ill subjects. For shortness-of-breath the proportions are 19.5% and 6.9%, while the figures for altered smell or taste are 23.5% and 16.0%, respectively.

Risks according to gender
Women who have been infected by SARS-CoV-2 report higher prevalence of heart palpitations than infected men (14.1 vs 4.9%) (Supplementary Table 5). The relative risk is 2.9 (95% CI: 1.6, 5.5) adjusted for age, prior chronic disease, and severity of infection. They also report higher prevalence of brain fog, fatigue, headache, dizziness, poor memory and altered smell or taste.

Number of symptoms
The proportion of Wave-1 subjects who have no symptoms after 11-12 months is 44%, while it is 38% for Wave-2 subjects after 1-6 months. Among uninfected subjects, the proportion without new symptoms during the last 12 months is 79% (Supplementary Table 6). Reporting more than 4 different symptoms was uncommon after both wave 1 and 2.
Correlation between symptoms Figure 3 shows bivariate correlations between the symptoms that are significantly associated to sequelae after COVID-19, for Wave-1 (left part) and Wave-2 subjects (right part). Altered smell or taste was weakly correlated with other symptoms. All other symptoms display one or more significant correlations with other symptoms, and the two figures suggest a similar pattern of symptoms for both waves. Using the correlation matrix from both Wave-1 and Wave-2 together (Supplementary Fig. 1 and Supplementary Table 7), we found that two underlying, latent factors explained 33% and 17% (in total 50%) of the variance in symptoms (Table 5). The correlation between the two factors was 58%. The symptoms that load highest on the first factor are brain fog, poor memory, dizziness, heart palpitations, and fatigue, while shortness-of breath and cough load highest to the second factor.

Discussion
We have calculated the prevalence of symptoms after waves 1 and 2 of SARS-CoV-2 infection in Norway among subjects who have and have not been infected. This allows for measures of association between infection and symptoms, using both risk differences and relative risks. We find that some symptoms were clearly associated to infection with SARS-CoV-2. These symptoms cluster into sets, such as a neurocognitive set (e.g. brain fog, dizziness and poor memory) and a cardiorespiratory set (e.g. shortness-of-breath and cough). Altered smell or taste represent a frequent symptom that is more common in women and more common after severe infection, but seems to have relatively low correlation to other symptoms. The findings support the view that long COVID may be more than one syndrome [8,9]. The excess risks for infected subjects were largest for altered smell or taste, poor memory, fatigue and shortness-of-breath after 11-12 months. The same symptoms had high excess risks after 1-6 months. It is interesting that muscle and joint pain have increased relative risk after 1-6 months, but not after 11-12 months. An important finding is that there is low excess risk for anxiety and depression. Post-illness studies of patients with severe coronavirus infection have found a relatively high prevalence of depression [10]. We found that the risk of depression is higher for subjects who experienced a more severe infection initially compared to mildly infected subjects (adjusted RR 1.7, Table 4). The results suggest that anxiety and depression might be more a consequence of the severity of initial disease and the trauma of hospitalisation, rather than a direct consequence of the viral infection itself.
A unique feature of infection with SARS-CoV-2 is the change in smell and taste [11]. For many patients, these changes disappear shortly after the acute infection, but for a subgroup they remain. We find that 16.6% of Wave-1 subjects report altered smell or taste. This is in accordance with proportions found across other studies [1][2][3]. The mechanisms behind this symptom is unknown. It is interesting that the correlations between altered smell or taste and cognitive symptoms in our study are relatively low, compatible with a suggestion that the main mechanism may be a local infection in olfactory epithelial cells [12], rather than an intracerebral affection.
This study has limitations. If we compare the prevalence of symptoms in Wave-1 and Wave-2 subjects, and make inferences about the duration of symptoms, there are important assumptions. One is that the type of coronavirus may have changed over time, so that the infection, at least theoretically, may have different long-term consequences. The other factor is that testing opportunities were more restricted during the spring of 2020 compared to later months, possibly inducing differential selection for Wave-1 and Wave-2 subjects. These limitations will be overcome as we continue  follow-up in MoBa, since viral sequencing has become prevalent, and the same individuals will be followed. Another limitation of our data is that most COVID-19 cases in our study sample were between 35 and 60 years old. Whether our findings are generalizable to younger and older age groups should be elucidated in other studies. The symptoms in this study are self-reported in questionnaires, which has both advantages and draw-backs. The advantage is that one can ask a large, representative sample simple questions in a design that measures the prevalence, duration and correlational structure of symptoms, as well as the degree of association to prior infection. This type of data provides a supplement to what can be learned from linking population samples to health care registrations, as is done, for instance, in the follow-up of veterans in the US [13]. The draw-back is the lack of detail for each participant. There is a possibility for misclassification for signs or diseases that are not easily recognized by the study participants, like myocarditis and kidney disease, as we did not have objective measurements to complement self-reported data. Similarly, there is a possibility that symptoms like anxiety and depression are underreported, which could potentially mask a larger excess risk among COVID-19 cases than observed in this study. In future research one can select random subgroups from population-based cohorts for in-depth clinical investigation, in parallel with clinical follow-up of the more severely affected patients. This will give a fuller picture of the clinical spectrum of long COVID. A next step for large cohorts will be to understand why some infected subjects develop long COVID and others do not, by performing nested case-control studies that utilise biomaterials, including whole-genome genotyping.
In conclusion, the present design sheds light on the causal link between infection with SARS-CoV-2 and longterm symptoms and suggest distinct clusters of symptoms due to different effects of the virus. The lack of correlations between symptoms question whether the diverse manifestations after infection with SARS-CoV-2 can be classified as one syndrome.   Table 7. Asterisks indicating significant correlations (*** for p < .001; ** for p < .01; * for p < .05) Table 5 Loadings of an exploratory factor analysis of post-acute symptoms among COVID-19 cases (n = 774). The analysis includes symptoms with increased risk among COVID-19 cases 11-12 months after diagnosis. The table shows standardized loadings from the pattern matrix using an oblimin rotation, allowing correlation between factors. The two factors explained 33% and 17% (in total 50%) of the variance in symptoms. a a We omitted rare occurrences (myocarditis, kidney disease) and symptoms with no increased risk (based on adjusted RR) among COVID-19 cases 11-12 months after infection (depression, mood swings, sleep problems, joint pain, muscle pain, fever, and hair loss) Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.