Background

Coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 virus, has become a global health threat [1, 2]. Not only the acute phase of the disease but also so-called “long-COVID” causes a substantial disease burden [3, 4]. A systematic review that estimated the prevalence of 55 long-term effects of COVID-19 reported that 80% of patients developed one or more long-term symptoms [5].

There is no clear definition of long-COVID so far. However, the National Institute for Health and Care Excellence (NICE) in the UK defined it as “signs and symptoms that develop during or following an infection consistent with covid-19 and which continue for more than four weeks and are not explained by an alternative diagnosis” [6]. This term includes ongoing symptomatic COVID-19 (from four to 12 weeks post-infection) and post-COVID-19 syndrome (beyond 12 weeks post-infection) [7].

The symptoms of long-COVID are varied and often differ from those of the acute phase of COVID-19. Miyazato and colleagues reported that the mean time from COVID-19 symptom onset to the emergence of alopecia was 58.6 days and that one patient presented with dysosmia 92 days after symptom onset [8]. Other symptoms, such as general fatigue [9, 10], respiratory symptoms [11, 12], cognitive and mental health disorders [13, 14], and others [15, 16], have also been reported as long-COVID.

When its chronic phase is taken into account, the disease burden of COVID-19 is expected to be larger than that of other respiratory infections because of the duration and variety of its symptoms. However, the empirical basis for a quantitative assessment of the disease burden imposed by long-COVID is currently scant.

As already mentioned, COVID-19 is one of the greatest global health crises. Because it is an infectious disease that will eventually become endemic, quantitative evaluations of its disease burden are necessary to appropriately assess the impact of interventions. The burden of long-COVID, as part of the disease burden caused by COVID-19, should be assessed separately from that of acute COVID-19 because it has clearly distinct characteristics.

Malik and colleagues reported a meta-analysis of post-acute COVID-19 syndrome and health-related quality of life (HRQoL) [17]. However, their results did not express HRQoL as a single health utility index between 0 and 1. Tran and colleagues investigated the validity of tools for measuring the impact of long-COVID and evaluated that impact quantitatively [18]; nevertheless, their main interest was not HRQoL itself but the validation of their own tool. Although Tabacof and colleagues also assessed the HRQoL of long-COVID patients [19], they focused on the individual components of the EQ-5D and had no control group. Fink and colleagues evaluated the correlation between persistent symptoms of pediatric COVID-19 and HRQoL, so their target population was different [20].

As described above, quantitative evaluations of HRQoL in adults with long-COVID that yield a single indicator of health utility, which can easily be applied in more comprehensive studies such as cost-effectiveness analyses, are still scarce. Our study aims to estimate an important part of the disease burden caused by COVID-19 in order to appreciate the potential impact of interventions against it.

Methods

Settings

We conducted a cross-sectional, retrospective survey in which a self-report questionnaire was mailed to eligible participants in April 2021, with two reminders sent 2 weeks and 1 month later. Potential participants were recruited from people who visited the outpatient service of the Disease Control and Prevention Center (DCC) at the National Center for Global Health and Medicine (NCGM) between 1 February 2020 and 31 March 2021 to undergo a pre-donation screening test for COVID-19 convalescent plasmapheresis (as part of another study named “Collection and antibody measurement of Convalescent plasma foreseeing the use for COVID-19 treatment”). That is, although the questionnaire survey was conducted in April 2021, all participants had a documented history of COVID-19 at least eight weeks before they visited the outpatient service. Visitors to the outpatient service were recruited on a voluntary basis, and 526 participants were included in the study. Visitors younger than 20 years were excluded from the survey. The minimum time from symptom onset or diagnosis of COVID-19 to the questionnaire survey was 56 days. Participants were asked to complete and return the questionnaire; 457 of 526 (86.9%) answered it completely and were included in the analysis.

Ethics approval

According to local ethical guidelines, providing responses to the questionnaire was considered as providing participant consent. This study was reviewed and approved by the Ethics Committee of the Center Hospital of the NCGM (NCGM-G-004121-00).

Measures

The EQ-5D-3L and EQ-VAS were used as outcome measures. The EQ-5D-3L questionnaire comprises five dimensions (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression), each of which has three levels: no problems, some problems, and extreme problems. The subject answers each question, and the responses are converted into a score between −0.6 and 1.0, with 0 corresponding to death and some exceptional health states having negative values, i.e., being considered by the average person as worse than dead. The subject is also asked to complete the EQ-VAS, a standard vertical 20 cm visual analogue scale used to record an individual’s rating of their overall current health-related quality of life; the scale ranges from 100 ("the best imaginable health state" or "the best health state you can imagine") to 0 ("the worst imaginable health state" or "the worst health you can imagine").
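To make the structure of the descriptive system concrete, the following minimal sketch enumerates the response profiles implied by five dimensions with three levels each. The numeric coding (1, 2, 3) and the five-digit profile strings follow the conventional EQ-5D notation; the function names are illustrative only. The conversion from a profile to an index value requires a country-specific value set and is deliberately not reproduced here.

```python
from itertools import product

# The five EQ-5D-3L dimensions, in the order listed in the text.
DIMENSIONS = ("mobility", "self-care", "usual activities",
              "pain/discomfort", "anxiety/depression")
LEVELS = (1, 2, 3)  # 1 = no problems, 2 = some problems, 3 = extreme problems


def all_profiles():
    """Return every possible response profile as a 5-digit string, e.g. '11111'."""
    return ["".join(str(level) for level in combo)
            for combo in product(LEVELS, repeat=len(DIMENSIONS))]


def describe(profile):
    """Map a profile string such as '11223' to a {dimension: level} dictionary."""
    if len(profile) != len(DIMENSIONS) or any(ch not in "123" for ch in profile):
        raise ValueError("profile must be five digits, each 1, 2 or 3")
    return {dim: int(ch) for dim, ch in zip(DIMENSIONS, profile)}


profiles = all_profiles()
print(len(profiles))       # number of distinct health states: 3**5
print(describe("11223"))   # levels per dimension for one example profile
```

Each of these profiles is mapped to a single index value by the value set, which is what allows the questionnaire to yield one health utility per respondent.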

We collected information about demographics (age, sex, height, weight, smoking, drinking, pregnancy, and past history of diseases), clinical course of the acute phase of COVID-19 infection (day of onset and/or diagnosis, admission status during the acute phase, use of antivirals/systemic steroids, requirement of supplementary oxygen/mechanical ventilation/extracorporeal membrane oxygenation during admission), and symptoms from onset to the questionnaire survey (fever, fatigue, shortness of breath, joint pain, myalgia, chest pain, cough, abdominal pain, dysgeusia, dysosmia, runny nose, red-eye, headache, sputum, sore throat, diarrhoea, nausea, appetite loss, hair loss, depression, loss of concentration, and memory disturbance). All symptoms were self-reported, together with their onset date and duration.

We included the following as confounding factors: age, sex, BMI, smoking, drinking, hypertension, diabetes, chronic obstructive lung diseases, malignancy, use of antivirals, use of systemic steroids, admission status, and severe COVID-19 disease during admission (use of mechanical ventilation or extracorporeal membrane oxygenation during admission, according to the definition in a report of national registry data in Japan [21]).

Statistical analysis

The sample size for the linear regression model was calculated using the F test [22]. The F test has numerator and denominator degrees of freedom. The numerator degrees of freedom, \(u\), is the number of coefficients excluding the intercept. In our case \(u = 12\); however, at the time of calculation, we set \(u = 15\). The denominator degrees of freedom, \(v\), is the number of error degrees of freedom:

$$v = n - u - 1$$

This implies

$$n = v + u + 1.$$

The effect size, \({f}^{2}\), is \({R}^{2}/(1-{R}^{2})\), where \({R}^{2}\) is the coefficient of determination, in other words, the “proportion of variance explained”. We used \({f}^{2}=0.15\), as recommended by Cohen [22], and set the level of significance at 0.05 and the power at 0.80. As a result, we obtained \(v=122.4\), and the required sample size was \(122.4 + 15 + 1 \cong 139\) after rounding up.
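The arithmetic of this calculation can be checked with a short sketch. It only reproduces the relations stated above (\(f^2 = R^2/(1-R^2)\) and \(n = v + u + 1\)); the value \(v = 122.4\) is taken as given from the power calculation and is not recomputed here. The function names are illustrative.

```python
import math


def f2_from_r2(r2):
    """Cohen's effect size f^2 from the coefficient of determination R^2."""
    return r2 / (1.0 - r2)


def r2_from_f2(f2):
    """Inverse relation: the R^2 that corresponds to a given f^2."""
    return f2 / (1.0 + f2)


def required_sample_size(v, u):
    """n = v + u + 1, rounded up to the next whole participant."""
    return math.ceil(v + u + 1)


# f^2 = 0.15 corresponds to roughly 13% of variance explained.
print(round(r2_from_f2(0.15), 3))
# Denominator df from the power calculation (given) and u = 15 predictors.
print(required_sample_size(122.4, 15))
```

Rounding up rather than to the nearest integer is the conservative choice, since a fractional participant cannot be recruited.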

Two-sided p values of < 0.05 were considered to show statistical significance. All analyses were conducted using R, version 4.0.5 [23].

Answers were classified into two groups: participants who had no ongoing prolonged symptoms and those who had any ongoing prolonged symptoms. An “ongoing prolonged symptom” was defined as a symptom that lasted longer than four weeks from the onset of the acute phase of COVID-19 infection (i.e., the “long-COVID” condition defined in [6]) and was present at the time of the survey. We evaluated the average treatment effect (ATE) of ongoing prolonged symptoms on the EQ-VAS, a self-reported health status measure ranging from 0 to 100, and on HRQoL values estimated from the EQ-5D-3L questionnaire [24] using the Japanese value set [25].
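The grouping rule can be written down as a small sketch. The record layout (`duration_days`, `present_at_survey`) is hypothetical and chosen only for illustration; it is not the structure of the study questionnaire. "Longer than four weeks" is read here as more than 28 days.

```python
FOUR_WEEKS_IN_DAYS = 28  # "longer than four weeks" is interpreted as > 28 days


def is_prolonged(duration_days):
    """A symptom is prolonged if it lasted longer than four weeks."""
    return duration_days > FOUR_WEEKS_IN_DAYS


def has_prolonged_symptom(symptoms):
    """True if any symptom lasted longer than four weeks, ongoing or not."""
    return any(is_prolonged(s["duration_days"]) for s in symptoms)


def has_ongoing_prolonged_symptom(symptoms):
    """True if any symptom is prolonged AND still present at the survey."""
    return any(is_prolonged(s["duration_days"]) and s["present_at_survey"]
               for s in symptoms)


# Hypothetical participants (field names are assumptions for illustration).
participant_a = [{"name": "fatigue", "duration_days": 90, "present_at_survey": True}]
participant_b = [{"name": "cough", "duration_days": 40, "present_at_survey": False},
                 {"name": "fever", "duration_days": 5, "present_at_survey": False}]

print(has_ongoing_prolonged_symptom(participant_a))  # exposed group
print(has_prolonged_symptom(participant_b),
      has_ongoing_prolonged_symptom(participant_b))  # prolonged, but not ongoing
```

The distinction matters because a participant whose prolonged symptoms had resolved before the survey falls into the comparison group under this rule.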

We used the inverse probability weighting (IPW) method with propensity scores calculated by a multivariable logistic regression model predicting the likelihood of having ongoing prolonged symptoms [26, 27]. The standardized mean difference and variance ratio were used to measure covariate balance; an absolute standardized difference above 10% and a variance ratio over 2.0 were each interpreted as indicating a meaningful imbalance [28].
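The weighting and balance checks can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated (the logistic regression step is not shown), uses the normalized form of the IPW estimator, and uses a simple weighted variance for the balance statistics; these are assumptions for illustration, since the exact estimator and software routines used in the analysis are not specified in the text. All function names are hypothetical.

```python
import math


def ate_weights(exposed, propensity):
    """IPW weights for the ATE: 1/e for exposed, 1/(1 - e) for unexposed."""
    weights = []
    for is_exposed, score in zip(exposed, propensity):
        if not 0.0 < score < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / score if is_exposed else 1.0 / (1.0 - score))
    return weights


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_variance(values, weights):
    mean = weighted_mean(values, weights)
    return sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / sum(weights)


def split_by_group(values, weights, exposed, group):
    """Return the values and weights of one exposure group."""
    pairs = [(v, w) for v, w, g in zip(values, weights, exposed) if bool(g) == group]
    return [v for v, _ in pairs], [w for _, w in pairs]


def ipw_ate(outcome, exposed, propensity):
    """Normalized IPW estimate of the ATE: weighted mean difference in outcome."""
    weights = ate_weights(exposed, propensity)
    y1, w1 = split_by_group(outcome, weights, exposed, True)
    y0, w0 = split_by_group(outcome, weights, exposed, False)
    return weighted_mean(y1, w1) - weighted_mean(y0, w0)


def balance(covariate, exposed, weights):
    """Weighted standardized mean difference and variance ratio of a covariate."""
    x1, w1 = split_by_group(covariate, weights, exposed, True)
    x0, w0 = split_by_group(covariate, weights, exposed, False)
    var1, var0 = weighted_variance(x1, w1), weighted_variance(x0, w0)
    pooled_sd = math.sqrt((var1 + var0) / 2.0)
    smd = 0.0 if pooled_sd == 0 else (weighted_mean(x1, w1) - weighted_mean(x0, w0)) / pooled_sd
    ratio = var1 / var0 if var0 > 0 else float("inf")
    return smd, ratio


# Toy data (not study data): two exposed and two unexposed participants.
outcome = [70, 80, 90, 100]
exposed = [1, 1, 0, 0]
propensity = [0.5, 0.5, 0.5, 0.5]
print(ipw_ate(outcome, exposed, propensity))
print(balance([1, 3, 1, 3], exposed, ate_weights(exposed, propensity)))
```

With the thresholds stated above, a covariate would be flagged when the absolute value of the first returned number exceeds 0.10 or the second exceeds 2.0.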

Additionally, we investigated factors other than ongoing prolonged symptoms that were associated with low EQ-VAS and EQ-5D-3L index values using linear regression models. Multicollinearity was examined using the variance inflation factor (VIF), with VIF ≥ 2.5 taken as an indicator of multicollinearity [29].
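For reference, under the standard definition the VIF of covariate \(j\) depends only on \(R_j^2\), the coefficient of determination obtained when covariate \(j\) is regressed on all other covariates in the model (the subscript notation is introduced here for illustration). The threshold of 2.5 therefore corresponds to 60% of the variance of a covariate being explained by the others:

$$\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \qquad \mathrm{VIF}_j \ge 2.5 \iff R_j^2 \ge 1 - \frac{1}{2.5} = 0.6.$$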

Results

The left side of Table 1 shows the basic characteristics of the participants. A total of 457 participants had recovered from the acute phase of COVID-19, and 108 of them presented with at least one ongoing prolonged symptom. The proportion of women was larger in the “Any symptom” group than in the “No symptom” group. There was no substantial difference between the two groups in terms of age, medical history, admission status, or days from symptom onset/diagnosis to the survey. About half of the participants had been admitted to hospital because of acute-phase COVID-19. A crude comparison showed that the “Any symptom” group had a lower EQ-VAS and EQ-5D-3L index than the “No symptom” group (EQ-VAS: 70 vs 85; EQ-5D-3L index: 0.81 vs 1.0, respectively). The right side of Table 1 describes the characteristics of the data after propensity score weighting. Ninety-five participants in the “Any symptom” group and 296 in the “No symptom” group were included; the other participants were discarded because of missing items.

Table 1 Characteristics of participants before and after propensity score weighting

Table 2 describes the characteristics of prolonged symptoms. In our study, we defined “long-COVID” as the status in which any symptom attributed to SARS-CoV-2 infection lasted longer than four weeks, regardless of whether it was still present at the time the survey was completed. As such, prolonged symptoms in this study indicate “long-COVID” symptoms as defined in [6]. In total, 201 of 457 (44.0%) participants reported at least one symptom lasting longer than four weeks after COVID-19 symptom onset. Among these, 73 (16.0%) reported one symptom, 46 (10.1%) two, 47 (10.3%) three, and 35 (7.7%) four or more symptoms. The most common of these prolonged symptoms was general fatigue, which was reported by 58 of 457 (12.7%) participants. The second most common symptom was alopecia: 55 of 457 (12.0%) participants experienced worse than usual hair loss.

Table 2 Details of symptoms that lasted longer than four weeks in the participants

Figure 1 shows the distribution of propensity scores before and after weighting. Figure 2 shows the balance of covariates before and after weighting. The balance of covariates between the two groups improved after IPW. The two groups differed mainly in terms of gender and BMI, which could confound the comparison of their HRQoL measurements. Figure 2 demonstrates that the standardized mean differences in these two factors decreased.

Fig. 1 Distribution of propensity score before and after weighting. Red colour represents “No symptom” group and blue colour represents “With symptom” group

Fig. 2 Balance of covariates before and after inverse probability weighting. Red squares represent before adjustment and blue circles represent after adjustment

Adjusted comparisons of the EQ-VAS and EQ-5D-3L scores were similar to the unadjusted crude comparisons (Table 3). The ATE of ongoing prolonged symptoms was −12.9 (95% confidence interval [CI] −15.9 to −9.8) on the EQ-VAS and −0.11 (95% CI −0.14 to −0.09) on the EQ-5D-3L. The differences attributed to the symptoms were larger than the minimally important difference estimated in a previous study (0.048, 95% CI 0.046 to 0.051) [30]. Therefore, prolonged symptoms can be regarded as having a clinically significant negative impact on patients’ EQ-VAS and EQ-5D-3L scores.

Table 3 Average treatment effect of ongoing prolonged symptoms on EQ-VAS and EQ-5D-3L index

Table 4 shows the results of the linear regression analyses of covariates associated with the EQ-VAS (4a) and EQ-5D-3L (4b). Both analyses showed that ongoing prolonged symptoms substantially influenced the EQ-VAS and EQ-5D-3L values. Although male sex and steroid use during admission were associated with higher EQ-VAS scores, no variable other than having ongoing prolonged symptoms was associated with the EQ-5D-3L scores. In both models, all VIF values were below 2.5.

Table 4 Results of linear regression analysis about (a) EQ-VAS, (b) EQ-5D-3L index score

Discussion

Our results demonstrated that people suffering from the phenomenon we called “long-COVID” showed lower HRQoL. This is another important aspect of COVID-19 to consider because it implies a heavier disease burden than that of other influenza-like illnesses (ILIs), not only because of its severity but also because of the characteristics of its chronic phase. In the first place, COVID-19 showed a higher case-fatality than other ILIs [31,32,33]. Additionally, it might cause a substantial burden through the accumulation of mild disease alone.

Furthermore, the frequency and the duration of symptoms due to “long-COVID” are also noteworthy. Our results showed that nearly half of the participants who recovered from acute COVID-19 (201/457) experienced at least one symptom lasting more than four weeks. Among participants who required supplementary oxygen support, 32 of 70 (45.7%) presented with at least one symptom lasting longer than four weeks. The precise duration of such symptoms could not be determined because more than 100 participants reported that their symptoms were still ongoing. Nevertheless, the symptoms attributed to “long-COVID” often continue for several months. Although the HRQoL valuations for participants who had any “long-COVID” symptoms were better than those previously reported during the acute phase of other ILIs in Japan (0.81 vs 0.66, respectively) [34], the HRQoL losses attributable to “long-COVID” should exceed those due to the acute phase of other ILIs because of its duration.
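The duration argument can be made explicit with a rough calculation, offered as an illustrative sketch rather than a result of this study. Assume a reference utility of 1.0 (the value observed in the “No symptom” group) and a constant utility \(q\) over an episode of duration \(t\), so that the HRQoL loss of the episode is \((1-q)\,t\). The symbols \(q\), \(t_{\mathrm{long}}\), and \(t_{\mathrm{ILI}}\) are introduced here for illustration only. Under these assumptions, long-COVID (\(q = 0.81\)) causes a larger loss than an acute ILI episode (\(q = 0.66\)) whenever

$$(1-0.81)\,t_{\mathrm{long}} > (1-0.66)\,t_{\mathrm{ILI}} \iff t_{\mathrm{long}} > \frac{0.34}{0.19}\,t_{\mathrm{ILI}} \approx 1.8\,t_{\mathrm{ILI}},$$

that is, whenever the long-COVID symptoms last more than about 1.8 times as long as the acute ILI episode.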

There are several strengths in this study. First, we evaluated the disease burden of long-COVID using standardised HRQoL instruments yielding HRQoL weights, which can be used as inputs for cost-effectiveness analyses with quality-adjusted life years (QALYs) as the outcome of interest. This characteristic will be beneficial for further research on COVID-19.

Second, we compared the burden of long-COVID symptoms with that of “control” participants who had a past history of the acute phase of COVID-19 infection and no ongoing symptoms due to long-COVID. As described in the Background, although a few studies have investigated the association between HRQoL and long-COVID, most of them did not compare the HRQoL of people suffering from long-COVID with that of healthy controls.

Additionally, our results suggest that prevention is more important in countermeasures against COVID-19 than in those against other ILIs because effective treatment of “long-COVID” has not yet been clearly established [7, 35]. Although there is no doubt that vaccination against SARS-CoV-2 reduces the risk of fatal and severe COVID-19 [36,37,38], its effectiveness against “long-COVID” has not yet been demonstrated. This may provide an additional incentive to prevent SARS-CoV-2 infection even in the absence of known risk factors for severe illness.

As our linear regression models demonstrated, no factor other than ongoing prolonged symptoms had a definite negative influence on HRQoL. This suggests that the lower HRQoL of long-COVID patients can be attributed to these symptoms, and therefore palliative methods against them would be important. With regard to the EQ-VAS, male sex and systemic steroid use during admission showed a positive impact on EQ-VAS values. The positive impact of male sex might be attributed to the finding that female COVID-19 patients experience long-COVID more often than male patients [8]. The effect of steroid use during admission is not clear. If treatment during the acute phase of COVID-19 is associated with a milder burden of long-COVID, then even mild cases should be treated with appropriate drugs. The impact of treatment during the acute phase of infection on its chronic phase (long-COVID) is an important challenge to address in future research.

In short, symptoms due to long-COVID may be a cause of low HRQoL. Since long-COVID might be an important contributor to future disease burden, effective countermeasures should be considered. At present, there is no established treatment of long-COVID. In anticipation of therapeutic agents for long-COVID, both pharmaceutical (e.g., vaccination) and non-pharmaceutical (e.g., social distancing) preventive interventions remain important.

There are several limitations in our study. First, since our results are based on a questionnaire survey, participants’ responses may be subject to cognitive biases. The participants answered the questionnaire at least eight weeks after they visited the outpatient service. Given these circumstances, the participants’ recall might have been affected. However, since this study aims to assess the burden of “ongoing” prolonged symptoms, this kind of influence could be trivial.

Second, the potential participants were enrolled from visitors to the outpatient department of the national center hospital for infectious diseases in Japan, implying that the study population tends to have had mild disease in the acute period and to be comparatively young. Although this can be regarded as a selection bias, long-COVID in relatively young age groups is a serious issue in society, meriting attention in the current social context.

Third, since the participants of this study voluntarily agreed to answer the questionnaire, they can be regarded as having more interest in their own health than the general population in Japan. This volunteer bias might cause an overestimation in the assessment of their prolonged symptoms. In addition, our data about participants’ symptoms were based on self-reported information and were not validated by healthcare professionals. However, we believe that this will not substantially impair the value of our findings because most symptoms attributable to long-COVID are subjective ones, such as fatigue, and are difficult to validate objectively even when assessed by healthcare professionals.

Fourth, there is possible bias caused by non-responders. We do not know why some participants did not complete the survey. The disease burden of long-COVID could therefore be under- or overestimated, although the response rate of our survey was quite high (86.9%).

Fifth, we should be careful about the representativeness of the data when interpreting the results because our survey includes a comparatively small number of participants. However, our sample size calculation indicated that the number of participants provided sufficient power to detect differences in HRQoL.

As discussed above, there are several sources of bias, and the results should be interpreted with care; nevertheless, the impact of these limitations can be regarded as comparatively small in this study.

Next, we could not take “new variants” into consideration. Differences in severity, infectiousness, and so forth between such new variants and older ones have already been reported [39, 40]; however, there is no solid evidence about the frequency and the severity of “long-COVID” symptoms with new variants. This should be the subject of future study.

The statistical model we chose also has its own limitations. Since we compared EQ-VAS and EQ-5D-3L scores after adjusting for participants’ background characteristics using the IPW method with propensity scores, we could include most of the participants in the main analysis. Nevertheless, we had to exclude some of them because of missing items, and these missing values might have had some impact on the results. Additionally, the variables we collected in the survey were limited, so there might be other factors that we could not take into consideration in this study. These limitations remain challenges to be addressed in future work. Nevertheless, we consider our results robust to some extent because the ATE evaluation and the linear regression analysis showed similar results. Both indicate that the symptoms caused by long-COVID might impair quality of life.

Conclusions

What we call “long-COVID” imposes a substantial disease burden in addition to the burden attributed to the acute phase of COVID-19. This additional burden makes the whole disease burden of COVID-19 heavier, making prevention strategies all the more important. The influence of acute phase treatment, vaccination, and variants on “long-COVID” should be examined in the near future.