Introduction

The COVID-19 pandemic has had an enormous effect on health and society worldwide. Over six million people have died as of July 2022 [1]. COVID-19 can lead to a variety of sequelae [2, 3], including neurological [4] and cardiac [5] problems, fatigue [6] and mental health effects [7]. The pandemic has had a large economic impact [8] as well as straining healthcare resources [9, 10], diverting care from other areas, worsening outcomes [11, 12], and creating a backlog of patients awaiting treatment [13,14,15].

It is plausible that disruption from the pandemic has led people to reassess their preferences, attitudes and priorities, including for health and healthcare; for example, what trade-offs they would be willing to make between health and quality of life. The possibility implies changes in how people would value health states and instruments measuring health-related quality of life. In particular, it is important to examine the EQ-5D instrument [16], as in many countries, including the UK [17], national EQ-5D value sets are used to allocate healthcare resources according to the public’s preferences. If people’s values for EQ-5D health states have changed due to the COVID-19 pandemic, the implication is that healthcare resources are not being allocated efficiently, a problem which is particularly acute when such resources are scarcer than ever.

If values for EQ-5D health states have changed due to COVID-19, there are also implications for recent and ongoing valuation exercises. Value sets created just before the pandemic may no longer be valid. Alternatively, if the shock to values is transient, value sets created in the present could have a short shelf life. In general, the “life-cycle” of value sets is a neglected area, and there have been calls for further research into it [18]. Investigating whether the COVID-19 health crisis affected people’s health preferences gives insight into whether it may have brought to an end the useful life of existing national EQ-5D value sets.

This paper examines specifically whether and how the onset of the COVID-19 pandemic affected how individuals from the UK valued EQ-5D-5L, the five-level version of the instrument [19], which measures whether people have problems on five dimensions: mobility, self-care, usual activities, pain/discomfort and anxiety/depression. On each dimension, individuals indicate whether they have no, mild, moderate, severe or extreme problems. Our work is a follow-up to Webb et al. [20], who compared survey data collected in 2018 and in 2020, shortly after the onset of the pandemic in the UK. They examined differences between the two time points in how individuals rated two EQ-5D-5L health states, 11111 and 55555, and dead using a visual analogue scale (VAS), as well as a derived value for 55555 on the full health = 1, dead = 0 scale used to calculate quality adjusted life-years (QALYs). In 2020 compared to 2018, ratings for 11111 and dead were lower, whereas ratings for 55555 were higher, both for the original VAS and the 1–0 scale. There were also differential changes according to subgroups such as gender, ethnicity and age.

Webb et al. [20] propose that the COVID-19 pandemic is the most likely cause of differences between 2018 and 2020. This paper seeks to complement and reinforce the previous findings by examining the survey data collected in 2020 more closely and exploiting its richness in terms of the number of questions asked about people’s COVID-19 experiences. In particular, we use responses to two survey questions about people’s perceived risk/worry about infection, and four questions about how people’s lives were affected during the pandemic, to examine whether people whose lives were more affected change their values more, which would lend credence to the supposition that the pandemic is the driver of value change.

This study uses VAS to measure people’s preferences for heath states, whereas EQ-5D value sets are usually created using time trade-off (TTO), possibly augmented with a discrete choice experiment (DCE) [21]. We comment on the relevance of our result for such value sets in Sect. “Strengths and weaknesses”.

Methods

Data collection

Primary data were collected using an online survey. Recruitment took place in two waves: 16th–23rd April 2020 and 4th–15th May 2020. Some individuals responded to both waves, and some only to one. Participants were asked questions about themselves and their experience of COVID-19. They rated two EQ-5D-5L states, 11111 and 55555, as well as dead, using the VAS task illustrated in Fig. 1. The scale ran from 100 = the best health you can imagine to 0 = the worst health you can imagine.

Fig. 1
figure 1

Example of visual analogue scale task

Participants were asked whether they had received a positive diagnosis or test for COVID-19. Those answering no were asked what they considered their chances of becoming infected with COVID-19 were, as well as whether they worried about being infected with COVID-19, both on Likert scales from 1 to 5. In wave 2, only additional questions were asked whether COVID-19 had affected their health and/or quality of life, with possible answers of no, yes-negatively and yes-positively. Participants were asked how often they left their homes to shop, and to get fresh air and exercise, and whether they considered themselves key workers. A full list of the COVID-19-related questions is given in appendix A. The survey did not allow participants to proceed without answering all questions; therefore, there is no missing data in submitted responses. (Although note as specified above, some questions were not asked in wave 1.)

Secondary data on COVID-19 prevalence and deaths on the final day of each recruitment wave (23/4/20 and 15/5/20) at Nomenclature of Territorial Units for Statistics (NUTS) level 1 were drawn from the UK coronavirus dashboardFootnote 1 and linked to each individual. The specific measures were cumulative COVID-19 cases by date of publication and cumulative deaths within 60 days of a positive test by date of death.Footnote 2

Analysis

Differences between wave 1 and 2 characteristics were assessed using Mann–Whitney U tests.

VAS ratings for 55555 on the 100–0 scale were transformed to the full health = 1, dead = 0 scale used for calculating quality-adjusted life-years (QALYs) [22]. This was done using the formula

$${\text{VAS}}_{{{55555}}}^{{{\text{rescaled}}}} = \frac{{{\text{VAS}}_{{{55555}}} - {\text{VAS}}_{{{\text{dead}}}} }}{{{\text{VAS}}_{{{11111}}} - {\text{VAS}}_{{{\text{dead}}}} }}$$

where \({VAS}_{i}\) is the rating of health state \(i\) on the 100–0 scale and \({\text{VAS}}_{{{55555}}}^{{{\text{rescaled}}}}\) is the value of 55555 on the 1–0 scale. The rescaled values of 55555 are dependent on ratings for 11111 and dead. Thus, it is possible for individuals’ rescaled 55555 values to increase in response to COVID-19 while their raw VAS ratings decrease, and vice versa, depending on how their ratings for 11111 and dead also change. The rescaling requires VAS ratings to be logical, i.e. \({\text{VAS}}_{{{11111}}}\) \(>\) and \({\text{VAS}}_{{{11111}}} {\text{ > VAS}}_{{{55555}}}\), and illogical responses were excluded from the analysis.

The VAS task can produce a tail of very low rescaled values for 55555, for example below  – 100, which can have a large influence on the mean [22]. In line with the previous study, rescaled values for 55555 were censored at -1.

Few individuals (N = 17) reported a formal COVID-19 diagnosis or positive test. Due to the low numbers and the fact that they were not shown key questions about their subjective risk of/degree of worry about COVID-19 infection, they were also excluded from the analysis.

Whether subjective risk of COVID-19 infection and worry about COVID-19 infection differed between the waves was assessed using Mann–Whitney U tests.

VAS ratings were analysed using two complementary approaches. Tobit regressions were suitable for analysing the continuous variables of cumulative COVID-19 cases/deaths, as well as allowing for interactions between variables. Multinomial propensity score matching (MNPS) was better able to isolate the causal effects of a given dependent variable, but could not accommodate continuous dependent variables or interaction terms. MNPS also required discarding some information by compressing Likert scale responses from a five- to a three-point scale.

Using pooled wave 1 and wave 2 data, three Tobit regressions were run with VAS ratings of 11111, dead and 55555 as dependent variables and upper limits of 100 and lower limits of 0. A Tobit regression was run with rescaled values of 55555 as the dependent variable and an upper limit of 1 and lower limit of -1 was run. Individuals’ subjective risk of COVID-19 infection, worry about COVID-19 infection and the interaction of the two were included as independent variables, along with cumulative cases and deaths by region and the following controls: age, age2, female, white, left school after minimum age, has degree or equivalent qualification, live alone, retired, employed, whether self-report being in levels 2–5 on each EQ-5D-5L dimension, number of long-term health conditions, self-rated health on a Likert scale from 1 to 5.

Using only wave 2 data, similar sets of Tobit regressions were run which added whether COVID-19 had affected participants’ health and/or quality of life. The regressions also included a variable indicating if someone considered themselves a key worker. Two further sets of Tobit regressions were then run which included first how often individuals left the house to shop, and second how often individuals left the house for fresh air and exercise.

There is potential for co-linearity between many variables of interest as well as controls. For example, correlation was expected between subjective risk of infection and worry about COVID-19. Older individuals, or those with long-term health conditions, may also be more concerned about COVID-19, as well as valuing health differently from younger, healthier individuals. MNPS was used to isolate the causal effect of: subjective risk of COVID-19 infection, worry about COVID-19 infection, whether COVID-19 affected health/quality of life, shopping frequency and exercise frequency. The five level responses for subjective risk and worry were collapsed to three groups: 1–2, 3 and 4–5. Responses for frequency of leaving the house for shopping/exercise were collapsed to: less than weekly, weekly and more than weekly. For each variable of interest, samples were created which were balanced on the same control variables used in the Tobit regressions, as well as on the other variables of interest using the twang package for R [23]. Twang estimates multinomial propensity score weights using gradient boosted models, an iterative machine learning technique which can accommodate non-linearity and interactions among the variables that the researcher seeks to achieve balance on. The estimand, i.e. the causal effect of interest, was the average treatment effect (ATE). The gradient boosting algorithm was run for 5,000 iterations, with the multinomial propensity score weights used for analysis taken from the iteration which minimised differences between treatment groups. Differences were assessed using the mean standardised effect size across all control variables. ATEs were then found by including the propensity score weights in weighted linear regression models.

To test the results’ robustness to the rescaled 55555 threshold of  – 1, the Tobit models were re-run with thresholds of  – 2 and  – 1.5. In addition, the models were re-run using a number of other approaches, which rather than censoring, removed participants with excessively low values for 55555 from the analysis data. Full details are given in appendix C, but in summary the approaches were: excluding participants with low rescaled values of 55555; excluding participants who had a large influence on the mean rescaled value of 55555; excluding participants with a high rate of change of rescaled 55555 values with respect to raw 55555 values.

All analysis was carried out using R.

Results

There were 3021 total responses to the survey, comprising 809 people who responded in wave 1 only, 826 who responded in wave 2 only, and 693 people who participated in both. There were 422 responses (14.0%) excluded for illogical VAS ratings. Only 11 respondents (0.4%) said they had received a positive COVID-19 diagnosis and 6 (0.2%) reported ever having a positive test result, all submitted illogical VAS ratings and were excluded on that basis.

Table 1 summarises the characteristics of both the full (N = 3021) and analysis (N = 2599) samples and Table 2 summarises their responses to COVID-19-related questions. Both samples were similar, although the analysis sample was slightly older (48.3 years vs. 47.7 years) and were less likely to have a long-term health condition (30.2% vs. 32.7%). There were relatively few older people in the analysis sample, with only 4.1% aged over 75, compared to 8.6% of the UK population [24]. Around 20% of wave 2 participants said that COVID-19 had affected their health negatively, while around 5% said its effect was positive. A larger proportion said COVID-19 had affected their quality of life, with almost half reporting a negative impact and 8.3% reporting a positive impact. The modal frequency of leaving the house to shop was weekly, whereas the modal frequency of leaving the house for exercise and fresh air was daily. Almost no one shopped more than daily (6; 0.5%) and few exercised more than daily (67; 5.1%) (Table 2).

Table 1 Participants’ characteristics
Table 2 Responses to COVID-19-related questions

Table 5 shows the results of comparing the demographics of waves 1 and 2. The only significant difference observed was in age, with wave 2 participants being slightly older than wave 1 at 45.7 compared to 47. There were no significant differences between survey waves in subjective risk of COVID-19 infection (Mann–Whitney U p value 0.189) or worry about COVID-19 infection (Mann–Whitney U p value 0.097).

Figure 2 shows histograms of the analysis sample’s VAS responses. A large proportion of respondents rated full health as 100 and dead as 0. Most rescaled 55555 values were positive but low, although many also rated it below 0 (i.e. worse than dead).

Fig. 2
figure 2

Histograms of VAS responses

Table 3 summarises the Tobit model results, with full results including control variables given in Appendix B. With the whole analysis sample, there was no significant main effect of worry about COVID-19 infection on any VAS responses, but there was a significantly positive main effect of subjective risk of COVID-19 infection and a significantly negative interaction with worry for dead and rescaled 55555. There were no significant effects of regional-level COVID-19 cases or deaths. With wave 2 respondents, there were no significant results for COVID-19’s effect on quality of life, but respondents reporting both negative and positive impact of COVID-19 on health rated 55555 significantly higher. There was a similar result for dead, although only the coefficient for positive impact of COVID-19 on health was significant. Frequency of leaving the house for shopping had no significant coefficients. Those who left the house to exercise less than weekly or weekly rated 11111 significantly lower and dead significantly higher compared to never leaving the house for exercise. Those who exercised weekly also rated 55555 significantly higher. Examining all exercise-related coefficients seems to indicate that the effects of exercise frequency on VAS ratings were not linear. For all sets of Tobit regressions using wave 2 data, the effects of subjective risk and worry about COVID-19 infection were similar to results using both waves, although some coefficients no longer achieved statistical significance. COVID-19 cases and deaths had no significant effects.

Table 3 Results from Tobit regressions of visual analogue scale ratings

Table 4 shows how balanced responses were across various variables of interest, both with and without propensity score weights. It can be seen in all cases that MNPS improved balance, although the results for COVID-19’s effect on health is notably worse compared to other variables of interest (1.09, compared to the next worst being 0.268). Table 4 also presents the results of Tobit regression models following MNPS. For dead, the coefficients for subjective risk of and worry about COVID-19 infection had opposite signs, in line with the negative interaction term seen in the Tobit regression models. Subjective risk had a significantly positive effect on dead and 55555 ratings, whereas worry had a significantly negative impact on dead and a positive effect on rescaled 55555. People who reported COVID-19 had affected their health, either positively or negatively, rated dead significantly higher. A similar result was seen for 55555, although only the coefficient for a negative effect was significant. The only significant coefficient for COVID-19’s effect on quality of life was that those reporting a positive impact rated 11111 lower. There were no significant effects for either shopping or exercise frequency.

Table 4 Average treatment effects of COVID-19-related variables on visual analogue scale ratings using matched data

The results of robustness tests of different ways of handling extremely low rescaled 55555 values are given in Appendix C. In most cases, alternative approaches reproduce similar results to those presented in the main body of the paper. However, the significant influence of worry and subjective risk on rescaled 55555 values is not reproduced in around half the robustness analyses. Similarly, the finding that those reporting a negative influence of COVID-19 on quality of life rated 55555 lower was not found in several alternative analyses.

Discussion

There are indications in our results that how individuals value health was influenced by the onset of the COVID-19 pandemic. Yet it is also clear that the reasons/drivers for this finding are complex, and people’s reactions to different aspects of the pandemic are difficult to untangle. For example, several COVID-19-related variables had a significant effect on some health states, but not others, and it is difficult to understand why. There is also evidence that subjective risk of and worry about COVID-19 infection had opposite effects for both dead and rescaled 55555. The differing impact of people’s self-assessed likelihood of catching COVID-19 infection and their worry about infection may be due to the danger the disease poses depending heavily on factors such as age and co-morbidities [25,26,27]. Supporting this hypothesis, worry about COVID-19 infection was significantly correlated with age (correlation 0.08, Pearson’s test p value < 0.001) but subjective risk of infection was not (correlation 0.02, Pearson’s test p value 0.225).

In several instances, those reporting that COVID-19 had affected their health gave significantly different responses compared to those who reported no effect. However, the direction of change was the same whether the effect of COVID-19 was positive or negative. One interpretation is that individuals’ health being affected either way by a pandemic disease which dominated public life caused them to reassess their views on health. Another possibility is that the result is due to a small sample size, as only around 5% of respondents said COVID-19 has positively affected their health. This group also likely differed systematically from the general population, since positive health impacts were likely due to factors such as avoiding an arduous commute and/or an unpleasant workplace. People who said COVID-19 positively affected their health were 81.3% employed/self-employed, compared to 60.2% for other individuals, and were also younger, with an average age of 38.8 compared to 49.8 for other respondents, lending weight to the supposition. While MNPS can theoretically eliminate differences in observed characteristics between groups, the worst balancing performance was seen for COVID-19’s impact on health, largely due to the small number reporting a positive effect.

While there were no significant correlations between VAS ratings and how often people left the house for shopping, there were significant and non-linear effects for exercise in the Tobit regressions in Table 3. These effects are difficult to interpret, due to it being unclear how large an impact the pandemic had on respondents’ behaviour. COVID-19 may have influenced how often people leave the houses for fresh air and exercise in different ways: Many will have gone out less to mitigate risk of infection, yet others may have exercised outside more, e.g. due to previously preferring alternative leisure activities. It should also be noted that although the signs of the exercise coefficients in MNPS also reflected a non-linear effect, they were not significant.

No significant effects were seen for regional COVID-19 cases or deaths. This could be due to people’s experience of the pandemic being driven either by national level reporting, or by cases reported in much smaller geographic areas, with regional variations of less importance.

Examining the magnitudes of the significant effects, there were some large effects for VAS ratings. For example, in the Tobit analysis, those whose health was positively affected by COVID-19 rated dead 14.8 points higher than those who did not, a difference covering around a seventh of the VAS range from 0 to 100. However, differences in rescaled values of 55555 were typically small, with magnitudes between 0.005 and 0.07. This is comparable to the smallest utility decrement in Devlin et al.’s [28] English EQ-5D-5L value set at 0.05. Thus, it may be that any changes to how people value health due to COVID-19 on the full health = 1, dead = 0 scale are relatively small.

Comparing the Tobit and MNPS results, in many cases, they are in agreement in terms of the significance and sign of coefficients. Where this is not the case, it is almost always a significant result using one approach, and an insignificant coefficient with the same sign using the other. Thus, it does not appear that using one approach leads to radically different conclusions than using the other.

It was difficult to determine a causal effect of COVID-19 on health state valuation. It may be that individuals who were more affected by the pandemic also systematically valued health differently from people who were less affected. Including control variables in the Tobit regressions and MNPS analysis mitigates this possibility to some extent. However, they can only control for the influence of observable characteristics. It is plausible that unmeasured personality traits, for example, with risk tolerance, could be associated both with health state valuation and measures of COVID-19’s impact such as subjective risk/worry about infection.

The results presented here represent a stronger (though by no means conclusive) argument for a causal influence of COVID-19 on health-state valuation when viewed as complementary to previous findings in Webb et al. [20]. That work showed significant differences in VAS ratings between before (2018) and during the pandemic (2020), and proposed that the pandemic is the most likely cause. And, that some COVID-19-related variables are significantly correlated in this study with VAS responses gives weight to this proposal.

The findings of the two studies are not always in the same direction. For example, in Webb et al. [20], VAS ratings for dead were lower in 2020 than in 2018, yet here, subjective risk of infection and COVID-19 affecting health lead to higher ratings. This is not necessarily contradictory, however. Some COVID-19-related variables, such as worry about infection, were shown in this study to have a negative effect on VAS ratings for dead, and in the previous study, some subgroups rated dead higher in 2020 than in 2018. As there are complicated interactions between individual characteristics and COVID-19 can affect people indifferent ways, it is not easy to compare the two studies’ results.

This study only presents evidence from the initial phase of the pandemic in the UK. Whether any acute changes to health state valuations persist is an unresolved question. The question will be addressed in part by the ongoing national UK EQ-5D-5L valuation study, in preparation since before the pandemic.Footnote 3 However, in most other countries, national valuation studies are not planned in the near future.

Our analysis sample was not representative of the UK population. In particular, it is difficult to tell how representative our respondents were in terms of the dependent variables of interest. For example, while there are other survey studies measuring risk and/or worry attitudes in the UK at a similar time (e.g. [29,30,31]), none used exactly the same measures. Different question phrasing and scales make it difficult to compare our participants’ attitudes to those reported elsewhere. Thus, it would not be possible to “re-adjust” existing UK value sets using our results. Another reason why such re-adjustment would not be possible is that the analysis sample is not representative in demographic terms, and the Tobit analysis revealed that demographic factors influenced VAS ratings. For example, female respondents rated 11111 higher and 55555 lower than male respondents and white respondents rated dead lower than non-white respondents.

That demographic characteristics affected VAS ratings raises the question as to whether COVID-19 may have had different impacts among different demographic groups. Investigating this possibility would be a useful topic for future research.

Policy implications

It is unclear whether knowledge about acute changes to health state values would have had a meaningful impact on how public money was spent in the first few months of the pandemic [32]. While some retrospective analyses of COVID-19 spending have been conducted [33,34,35,36], it is not clear that measures of cost-effectiveness were a factor in decision-making in the initial crisis stages [37,38,39,40]. Ultimately whether decision-making in an acute crisis should take account of cost-effectiveness measures such as maximising QALY gain per pound spent is a question for policymakers (and by extension the voters who appoint them in a democratic society). Yet if anyone should wish to advocate for a larger role of health economics or health state values in future crises, it is essential to have a firm understanding of, and evidence base for, the validity of health economic techniques in such crises.

We present evidence relating to the COVID-19 pandemic only, and not from any other crisis event which could influence values. Yet that one crisis can influence preferences for health states should be a prompt to investigate to what extent other crises could also affect how people value health.

In the longer term, large-scale EQ-5D valuation studies are resource intensive, so it is impractical to construct new ones in every country in the wake of COVID-19, or after every political, historical or health crisis event that may or may not affect how people think about and value health. Many allocation decisions are long term, such as approving health technologies for use for the foreseeable future. It is not clear that such decisions should be influenced by any short-term preference fluctuations.

Nevertheless, our findings suggest the need to develop standardised, resource-light methods to assess whether and how national EQ-5D value sets still reflect people’s preferences, attitudes and values. Such methods could involve collecting smaller amounts of data than full-scale national value sets, or using less resource intensive methods such as online self-complete surveys. Another approach could be re-analysing existing data (see Webb and Kind [41] for a tentative step in this direction).

Strengths and weaknesses

This study has several strengths. As far as we are aware, this paper and our previous one are the only studies to address whether a large-scale crisis event has affected population health state valuations. Data collection was timely, giving insight into how people valued health shortly after the onset of the COVID-19 pandemic in the UK. We also used complementary analysis techniques: MNPS increased the possibility of identifying causal effects for individual variables, enabling us to disentangle the various effects of different aspects of the pandemic, while the Tobit regressions allowed for continuous dependent variables and interaction terms, as well as exploiting the full variation in the data without compressing responses into coarser categories.

The VAS task asked individuals to give an explicit value for dead, relative to the worst health they can imagine, in contrast to methods such as time trade-off where dead is always assigned a value of 0. This allows us to investigate individuals’ attitudes to health and death, and to what extent people believe any health states are worse than dead. The survey collected several different COVID-19-related variables. This has allowed us to distinguish between different aspects of how the pandemic has affected people.

This study also has several weaknesses. For example, some results were not robust to using different methods to censor or remove extremely low rescaled 55555 values. We valued health states using VAS, whereas national value sets are more usually created using other techniques such as TTO [21]. Thus, it may be that our findings would not have been robust to using a more standard technique. Yet if VAS and TTO both elicit the same underlying health preferences, changes found using one technique should be expected to be found using the other.

The only EQ-5D-5L state for which values on the full health = 1, dead = 0 scale was found was 55555. While this state has theoretical importance as the worst state in the classification system, few people report being in it. Thus, it is difficult to say what impact COVID-19 would have on more commonly experienced health states. Age is a large risk factor in COVID-19 mortality and severe illness [42,43,44], so it is older people where the largest effects might be expected to be seen. However, the survey collected relatively few older respondents, with, for example, only 4.1% of the analysis sample aged 75 or older.

Conclusion

We presented evidence that the COVID-19 pandemic may have affected how people value health, as was also seen in Webb et al. [20]. However, there were differences in what aspects of the pandemic influenced individuals’ values. For example, there were differences between people who did and did not report that COVID-19 had affected their health, but no analogous finding for its effect on quality of life. Future research is required to disentangle the complex situation.

Future research could also investigate the influence on valuation of other important health events, or political/social shocks. It would also be useful to investigate whether any effects on valuation, due to COVID-19 or other events, are permanent or transient.

Finally, our results suggest the need to develop standardised methods to quickly and easily assess whether national EQ-5D value sets still represent national values.