Key Points for Decision Makers

This systematic review and meta-analysis provides community- and choice-based health state utility values (HSUVs) for lung cancer, thereby enhancing the validity and reliability of future economic evaluations.

We show that HSUVs for lung cancer vary by stage and—among the methodologically most comparable studies—by country. Subgroup analyses indicated that, among those with metastatic non-small-cell lung cancer, HSUVs decreased throughout the last year of life and may be lower while undergoing a third or fourth treatment line or when disease progresses.

The presented evidence supports the use of stage-specific and—if available—country-specific HSUVs for lung cancer. In addition, for metastatic non-small-cell lung cancer, adjusting for the lower HSUV in the last year of life may be considered, as may further stratification by treatment line or progression status. If the use of HSUVs for other health states is required, our comprehensive breakdown of study characteristics may help identify suitable studies.

1 Introduction

Lung cancer is the leading cause of cancer-related mortality worldwide [1]. New interventions, such as low-dose computed tomography screening [2] and immunotherapy [3], may reduce this burden.

For policy makers, it is important to weigh the benefits of such new interventions against their costs in an economic evaluation. Economic evaluations often express health benefits in terms of quality-adjusted life-years. This measure adjusts the life-years gained by a new intervention (vs. current practice) for health-related quality of life (HRQoL) using health state utility values (HSUVs). HSUVs are weights anchored at 0 (representing death) and 1 (representing full health). In some cases, values < 0 are used to represent health states considered worse than death.
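As an illustration of how HSUVs enter a quality-adjusted life-year calculation, the following minimal Python sketch weights the time spent in each health state by its utility. The function name and the durations are hypothetical; the two weights are borrowed from the pooled stage-specific values reported later in this review purely as example inputs. Discounting, which most economic evaluations apply, is deliberately left out.

```python
def qalys(durations_years, utilities):
    """Sum the time spent in each health state, weighted by its HSUV.

    Hypothetical helper for illustration only; no discounting is applied.
    """
    if len(durations_years) != len(utilities):
        raise ValueError("one utility is needed per health state duration")
    return sum(t * u for t, u in zip(durations_years, utilities))


# Hypothetical scenario: the new intervention adds one year in the first state.
qalys_new = qalys([2.0, 1.0], [0.78, 0.69])      # 2 years, then 1 year
qalys_current = qalys([1.0, 1.0], [0.78, 0.69])  # 1 year, then 1 year
qaly_gain = qalys_new - qalys_current
print(round(qalys_new, 2), round(qalys_current, 2), round(qaly_gain, 2))
```

In this sketch, the extra life-year counts as 0.78 quality-adjusted life-years rather than a full year, which is exactly the adjustment that HSUVs are meant to provide.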

HSUVs can be elicited using a variety of methods. First, patients can be asked to directly value their own HRQoL. Valuation can be done using the choice-based time trade-off (TTO) or standard gamble (SG) methods or the non-choice-based visual analogue scale. In simple terms, choice-based methods determine what respondents would be willing to give up or risk to avoid living in that health state. Indirect elicitation methods are also available, such as asking patients to complete a generic (i.e. applicable across different diseases) multi-attribute instrument. Examples of such generic instruments are the EuroQoL 5-Dimensions (EQ-5D), the Short-Form Six Dimensions (SF-6D), and the Assessment of Quality of Life (AQoL). Based on their answers, each patient is assigned a health state that has been valued by members of the general public. These pre-determined valuation sets are called the tariff. Another indirect elicitation method is drafting vignettes that describe a patient’s HRQoL and then asking individuals to value these vignettes. Finally, some studies have attempted to convert other HRQoL measures (such as the condition-specific European Organization for Research and Treatment of Cancer Quality of Life Questionnaire) to an existing generic multi-attribute instrument without using a valuation method. This practice is called mapping.
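To make the two choice-based valuation methods mentioned above more concrete, the sketch below shows the standard textbook calculation for health states considered better than death. The function names and example values are hypothetical, and the sketch does not describe the protocol of any study included in this review.

```python
def tto_utility(years_full_health, years_in_state):
    """Time trade-off: the respondent is indifferent between living
    `years_in_state` in the health state and a shorter life of
    `years_full_health` in full health."""
    if not 0 <= years_full_health <= years_in_state or years_in_state <= 0:
        raise ValueError("expected 0 <= years_full_health <= years_in_state")
    return years_full_health / years_in_state


def sg_utility(p_full_health):
    """Standard gamble: the respondent is indifferent between the health
    state for certain and a gamble with probability `p_full_health` of full
    health (and otherwise immediate death)."""
    if not 0 <= p_full_health <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return p_full_health


# Hypothetical respondent: gives up 3 of 10 years, or accepts a 25% risk of death.
u_tto = tto_utility(years_full_health=7.0, years_in_state=10.0)
u_sg = sg_utility(p_full_health=0.75)
print(u_tto, u_sg)
```

The more a respondent is willing to give up (life-years in the time trade-off, certainty of survival in the standard gamble), the lower the resulting utility for the health state.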

Most international guidelines, including those from the UK National Institute for Health and Care Excellence (NICE), prefer that the HRQoL of actual patients is valued by members of the general public (i.e. community based) using choice-based methods [4,5,6]. For reasons of comparability (e.g. across studies or diseases), the preferred instrument in most guidelines is the EQ-5D [5, 6].

Because of the broad variation in elicitation methods, HSUVs for lung cancer have been reported to vary drastically across the literature [7]. Using different HSUVs can lead to different policies being ranked as cost effective [8]. Therefore, it is important to systematically identify appropriate and high-quality HSUVs for economic evaluations [9].

Although earlier studies attempted to provide an overview of HSUVs for lung cancer, these only included metastatic non-small-cell lung cancer cases [10], were not systematic reviews [7], did not include an overview of study characteristics or a critical appraisal [7, 10], and did not provide a pooled set of methodologically high-quality HSUVs [7, 10]. Therefore, we aimed to provide a current systematic review of HSUVs for all types of lung cancer, including an overview of study characteristics and a critical appraisal, and a pooled set of community- and choice-based HSUVs for use in economic evaluations.

2 Materials and Methods

2.1 Study Protocol

The protocol for this study was prospectively registered in the PROSPERO database (reference number CRD42018081495) [11]. This study was undertaken in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement [12]; the Cochrane Handbook for Systematic Reviews [13]; the good practices report by the International Society for Pharmacoeconomics and Outcomes Research, Identification, Review, and Use of Health State Utilities in Cost-Effectiveness Models [9]; a similar technical support document developed for NICE [14]; and recent guidance published in PharmacoEconomics [15].

2.2 Search Strategy

A broad and systematic search was conducted in the Embase, Ovid MEDLINE, Web of Science, Cochrane CENTRAL, Google Scholar, and School of Health and Related Research Health Utility Database (ScHARRHUD) databases on 6 March 2017 and updated on 17 April 2019. In short, synonyms for lung cancer were combined with synonyms for health state utility values; quality of life; different analyses, methods, and instruments suitable for eliciting HSUVs; and different valuation techniques. Conference abstracts, letters, notes, commentaries, and editorials were excluded. The complete syntax is provided in the electronic supplementary material (ESM 1; Methods).

2.3 Study Selection

We used EndNote X9 software to remove duplicates [16]. The first and second authors screened titles and abstracts of all initial references according to a pre-specified algorithm, which was designed to broadly identify studies that may report lung cancer-specific HSUVs elicited using any technique (see the Methods in ESM 1). In short, references were selected when the title or abstract indicated that (1) study results were likely lung cancer specific and (2) HSUVs were measured, or HRQoL was measured using an instrument suitable to elicit HSUVs, or HRQoL scores from another instrument were mapped onto a utility scale, or HRQoL was measured and the use of a valuation method was mentioned, or the study was a cost-utility analysis, or the study was a quality-adjusted-survival study. References included by only one of the two reviewers were discussed until consensus was reached. References added after the search update were screened only by the first author.

The full text of selected articles was subsequently screened by the first author according to a second pre-specified algorithm (see the Methods in ESM 1) and discussed with the second author. In short, studies were included for critical appraisal if the full text reported at least one original (i.e. not previously published) lung cancer-specific mean or median HSUV, including a measure of variance. Only studies written in English or Dutch were considered. Conference abstracts were not considered because these often present only preliminary, incomplete, or non-peer-reviewed data. Secondary literature (e.g. literature reviews and cost-utility analyses that sourced HSUVs from the literature) was excluded but checked for cross-references. Articles selected for full-text screening were also checked for cross-references.

2.4 Data Extraction and Critical Appraisal

A digital data extraction form was developed in Microsoft Excel 2016, piloted on six studies, and subsequently refined. First, study characteristics were extracted for use in a critical appraisal. We developed a customised critical appraisal tool for assessing the relevance and validity of the selected studies, based on HSUV-relevant items from several established tools and good practices reports [9, 14, 17,18,19]. In accordance with most international guidelines, study relevance was deemed high if HRQoL was measured in actual patients and valued by members of the general public using a choice-based method (i.e. elicitation was community- and choice-based) [6]. Studies that scored insufficiently on any of these relevance items were excluded from subsequent analyses. This approach prioritises consistency of the methodology across studies [9].

For the remaining studies, all study characteristics that may affect HSUVs were extracted and summarised. If a single study (or multiple studies using the same data) applied different tariffs to the same HRQoL data, only the analysis that applied the matching tariff was extracted (i.e. the tariff matching the country of participants from whom HRQoL was measured). Similarly, if a single study applied multiple instruments to the same patients, only the most commonly preferred instrument was extracted. In accordance with several international guidelines, including those of NICE, the EQ-5D was preferred, followed by other generic preference-based instruments, and finally any remaining methods [5, 6]. Again, this approach prioritised consistency of methodology across studies. Data were extracted by the first author and subsequently discussed with the second author.

2.5 Meta-Analysis and Statistical Methods

All studies remaining after critical appraisal were included in subsequent analyses, if appropriate. Mean or median HSUVs and standard errors were extracted. If standard errors were not available, they were calculated using available information [13]. If median HSUVs were reported, standard deviations were estimated by dividing the interquartile range by 1.35 [13]. Then, the estimated standard deviation was used to calculate the standard error. For studies that reported HSUVs for a control group of the general population, we formally tested the disutility due to lung cancer using a t test, assuming unequal variances. For mapping studies, we extracted the observed HSUV data, if available.
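The conversions described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used for this review: the function names and all input numbers are hypothetical, the confidence-interval conversion assumes a normal approximation, and the t test is shown only up to the test statistic and the Welch-Satterthwaite degrees of freedom.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_sd(sd, n):
    """Standard error of a mean from the standard deviation and sample size."""
    return sd / math.sqrt(n)


def se_from_ci(lower, upper, z=Z_95):
    """Standard error from a 95% CI, assuming a normal approximation."""
    return (upper - lower) / (2 * z)


def sd_from_iqr(q1, q3):
    """Approximate the standard deviation as the interquartile range / 1.35."""
    return (q3 - q1) / 1.35


def welch_t(mean1, se1, n1, mean2, se2, n2):
    """t statistic and degrees of freedom for a t test with unequal variances."""
    t = (mean1 - mean2) / math.sqrt(se1**2 + se2**2)
    df = (se1**2 + se2**2) ** 2 / (se1**4 / (n1 - 1) + se2**4 / (n2 - 1))
    return t, df


# Hypothetical study reporting a median with an IQR of 0.50-0.77 in 100 patients.
sd_median = sd_from_iqr(0.50, 0.77)
se_median = se_from_sd(sd_median, 100)

# Hypothetical study reporting only a mean with a 95% CI of 0.61-0.75.
se_ci = se_from_ci(0.61, 0.75)

# Hypothetical patients (n = 100) versus general population controls (n = 400).
t_stat, t_df = welch_t(0.70, 0.02, 100, 0.85, 0.01, 400)
print(round(sd_median, 3), round(se_median, 3), round(se_ci, 4))
print(round(t_stat, 2), round(t_df, 1))
```

A p value would follow from comparing the t statistic with a t distribution on the computed degrees of freedom, which a statistics library would normally handle.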

If necessary, we first pooled mean HSUVs across strata within studies using a fixed-effects model [20, 21]. For studies measuring HSUVs at multiple time points in the same individuals, we only extracted and pooled the HSUV at the time point closest to baseline to avoid violating the assumption of independence of observations [22, 23].
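The within-study pooling step can be illustrated with a minimal inverse-variance sketch in Python. The function name and the two strata are hypothetical, and the sketch assumes that each stratum reports a mean HSUV with its standard error; it is not the code used for this review.

```python
import math


def pool_fixed(means, ses):
    """Fixed-effects (inverse-variance) pooled mean and its standard error."""
    weights = [1.0 / se**2 for se in ses]  # more precise strata weigh more
    total_weight = sum(weights)
    pooled = sum(w * m for w, m in zip(weights, means)) / total_weight
    return pooled, math.sqrt(1.0 / total_weight)


# Hypothetical study reporting HSUVs for two strata with equal precision.
study_mean, study_se = pool_fixed(means=[0.80, 0.70], ses=[0.02, 0.02])
print(round(study_mean, 3), round(study_se, 4))
```

Because the weights depend only on the within-stratum variance, this model treats all strata as estimating one common value, which is a reasonable working assumption within a single study but less so across studies.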

As clinical and study characteristics were expected to vary across studies [7], HSUVs across the different studies were then pooled using a random-effects model [20, 24]. To account for possible differences in HSUVs by stage [7, 25], results were separately pooled for studies reporting HSUVs for all stages, for stages I–II, and for stages III–IV. Differences between the pooled HSUVs for stages I–II and stages III–IV were formally tested using a t test, assuming unequal variances.
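For readers less familiar with random-effects pooling, the sketch below implements the DerSimonian-Laird estimator, one commonly used approach. The text above does not state which between-study variance estimator was applied, so this choice is an assumption, as are the function name and the three hypothetical studies.

```python
import math


def pool_random_dl(means, ses):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator.

    Returns the pooled mean, its standard error, tau^2, Cochran's Q,
    and I^2 (in percent).
    """
    k = len(means)
    if k < 2:
        raise ValueError("at least two studies are needed")
    w = [1.0 / se**2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * m for wi, m in zip(w, means)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effects mean.
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))
    df = k - 1
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (se**2 + tau2) for se in ses]
    sum_w_star = sum(w_star)
    pooled = sum(wi * m for wi, m in zip(w_star, means)) / sum_w_star
    return pooled, math.sqrt(1.0 / sum_w_star), tau2, q, i2


# Three hypothetical studies with identical precision but different means.
re_mean, re_se, re_tau2, re_q, re_i2 = pool_random_dl(
    means=[0.60, 0.70, 0.80], ses=[0.02, 0.02, 0.02]
)
ci_low, ci_high = re_mean - 1.96 * re_se, re_mean + 1.96 * re_se
print(round(re_mean, 2), round(ci_low, 2), round(ci_high, 2), round(re_i2, 1))
```

In this hypothetical example, nearly all observed variation exceeds what sampling error alone would produce, so the random-effects confidence interval is much wider than a fixed-effects interval would be.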

The study selection based on our critical appraisal accounts for several potential sources of heterogeneity, including the respondent type (i.e. only patients) [7], the elicitation method (i.e. only indirect), the valuation method (i.e. only community- and choice-based) [7, 25, 26], and the upper bound of the utility scale (i.e. only perfect health) [7]. To account for further sources of heterogeneity, a sensitivity analysis pooled HSUVs only across studies that explicitly used the three-level EQ-5D (EQ-5D-3L) instrument. A second sensitivity analysis included only studies that used the EQ-5D instrument (regardless of the version), while also applying the tariff matching the country of HRQoL respondents [9, 27, 28]. This second sensitivity analysis aimed to provide the methodologically most comparable HSUVs for each available country. We further conducted exploratory subgroup analyses by histology (non-small cell vs. small cell) [7], sex [27], age [27], treatment modality, treatment line, and progression status. Results of the second sensitivity analysis and the different subgroup analyses were not pooled because of the anticipated low numbers of studies within each group.

Meta-analysis was performed in R software version 3.6.1 [29] using the meta [30] and metafor [31] packages. Although the PRISMA checklist for systematic reviews recommends assessing the risk of publication bias in a funnel plot [12], we did not do so because this is not meaningful for continuous outcomes in a single group.

3 Results

3.1 Search Strategy and Study Selection

After removing duplicates, our search included 5828 studies. We further identified 13 studies by cross-referencing. After screening the titles and abstracts of all identified studies, we assessed the full text of 458 studies. Of those, 407 studies were excluded for reasons outlined in Fig. 1, leaving 51 studies for inclusion in the critical appraisal.

Fig. 1

Flowchart of selection of studies reporting community- and choice-based health state utility values for lung cancer. HRQoL health-related quality of life, HSUV health state utility value, ScHARRHUD School of Health and Related Research Health Utility Database

3.2 Critical Appraisal

The relevance of 27 of the 51 studies was high (see Table 1 in ESM 1) [32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58]. Of these, one study separately analysed two datasets [36], which were treated as separate studies. The remaining 24 studies were excluded from subsequent analyses [59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82]. Among the excluded studies, four did not measure HRQoL in patients [72, 77, 81, 82], nine did not use valuation by members of the general public [59, 60, 63, 69, 71, 72, 74, 81, 82], 11 did not use a choice-based method for valuation [59, 63, 65, 67,68,69,70,71, 74, 76, 81], and nine had missing data on one or more of these items [61, 62, 64, 66, 73, 75, 78,79,80].

Among the included studies, the number of patients included for HSUV analysis ranged from 43 to 2396. Only two of 27 studies clearly stated that missing HRQoL data were imputed or that HRQoL response was complete [37, 47]. Six studies performed multiple HRQoL measurements in the same participants. Two of those studies, which used time-to-death categories, did not report loss to follow-up [57, 58]. These two studies were analysed separately because the time since diagnosis could not be derived. The other four studies with repeated measures all reported loss to follow-up at each evaluated time point [40, 46, 50, 52].

3.3 Study Characteristics

Characteristics of the included studies are provided in Tables 2a–c (see ESM 1). One study included only stage I and/or II cases [47], and 13 studies included only stage III and/or IV cases [35,36,37,38,39, 44, 45, 48, 49, 51, 52, 57, 58]. However, two of these stage III–IV studies stratified HSUVs by time to death [57, 58]. These studies were analysed in a separate subgroup analysis. Five of the 13 studies that included all stages stratified HSUVs by stage [33, 34, 40, 42, 46]. Thus, the main analysis included 13 studies with HSUVs for all stages [32,33,34, 40,41,42,43, 46, 50, 53,54,55,56], six studies with HSUVs for stages I–II [33, 34, 40, 42, 46, 47], and 17 analyses across 16 studies with HSUVs for stage III–IV [33,34,35,36,37,38,39,40, 42, 44,45,46, 48, 49, 51, 52].

Eight of the included studies reported mean time since diagnosis [32, 33, 36, 37, 42, 43, 50, 53], which ranged from 27 days to 2.59 years. Most included studies used the EQ-5D instrument; the exceptions were one study that used the AQoL instrument [40] and two studies that used the SF-6D instrument [43, 54]. Among EQ-5D studies, six did not specify the version used [32, 37, 45, 48, 49, 56], one used the new five-level EQ-5D (EQ-5D-5L) [55], and 14 used the EQ-5D-3L [33,34,35,36, 38, 39, 41, 42, 44, 46, 47, 50, 51, 53]. The 14 studies that explicitly used the EQ-5D-3L were separately pooled in a sensitivity analysis. All EQ-5D studies and the AQoL study used the TTO method for valuation, whereas the SF-6D studies used the SG method. Only three studies collected data through a personal interview [32, 46, 55]. Most of the studies reported mean HSUVs (one reported median HSUVs [40]).

Of 27 studies, 13 applied the tariff that matched the country of origin of the HRQoL respondents [36, 39,40,41,42, 44,45,46,47, 49, 50, 53, 55]. One of these 13 studies did not use the EQ-5D instrument [40]. The remaining 12 studies, which comprised 13 analyses, were included in a second sensitivity analysis of the methodologically most comparable HSUVs for each country [36, 39, 41, 42, 44,45,46,47, 49, 50, 53, 55].

In total, 12 studies included only non-small-cell lung cancer [33,34,35,36,37,38, 41, 47, 49,50,51,52]. The remaining studies included all lung cancer cases regardless of histology. Of these studies, three provided histology-specific HSUVs [46, 48, 53]. However, one of these studies included only cases with stage IIIb–IV lung cancer [48]. For reasons of comparability across studies, only the remaining two studies were included in a subgroup analysis by histology [46, 53].

The percentage of male patients ranged between 37 and 93%. Five studies provided HSUVs stratified by sex [33, 40, 46, 48, 53]. However, one of these studies only included stage IIIb–IV lung cancer cases [48]. Thus, the remaining four studies were included in a subgroup analysis of HSUVs by sex [33, 40, 46, 53].

The mean or median age of patients ranged between 51 and 70 years. Five studies stratified HSUVs by age [33, 40, 46, 48, 53]. Two of those studies did not provide the number of patients in the different age groups [48, 53]. Of the remaining three studies, which included all stages of lung cancer, two used similar age categories. These two studies were included in a subgroup analysis of HSUVs by age [40, 46].

A total of 13 studies allowed the derivation of treatment-specific HSUVs, either by inclusion criteria or by HSUV stratification [33,34,35,36,37,38,39, 44,45,46,47, 49, 53]. However, only seven of these studies allowed the derivation of HSUVs according to treatment modality (surgery, radiotherapy, chemotherapy, or a combination of those) [33, 37, 39, 44, 46, 47, 49]. Of these seven studies, two included all stages of lung cancer. Because the recommended treatment modality for lung cancer is mainly based on stage, a subgroup analysis of HSUVs by treatment modality was conducted using these two studies [33, 46]. Only one study was identified that reported HSUVs by treatment line [38]. This study was included in a further subgroup analysis by treatment line.

We identified two studies reporting HSUVs by progression status [38, 49]. Both studies included only patients with metastatic non-small-cell lung cancer. These studies were included in a subgroup analysis by progression status.

3.4 Health State Utility Values

Figure 2 provides an overview of HSUVs across all included studies. The pooled HSUV for all stages was 0.68 (95% confidence interval [CI] 0.61–0.75) across 5100 individuals. HSUVs for all stages ranged from 0.51 (95% CI 0.49–0.53) [50] to 0.81 (95% CI 0.78–0.84) [43], indicating the presence of significant heterogeneity (p < 0.01). Most heterogeneity could not be attributed to sampling error (I² = 99%). For stages I–II, the pooled HSUV was 0.78 (95% CI 0.70–0.86) across 1510 individuals. There was significant heterogeneity across stage I–II studies (p < 0.01; I² = 92%), as results ranged from 0.62 (95% CI 0.51–0.72) [40] to 0.88 (95% CI 0.86–0.90) [47]. The pooled HSUV for stage III–IV was 0.69 (95% CI 0.65–0.73) across 4703 individuals. The analysis of stage III–IV studies showed significant heterogeneity (p < 0.01; I² = 98%), with study results ranging from 0.51 (95% CI 0.48–0.54) [51] to 0.85 (95% CI 0.83–0.87) [39]. The difference between the pooled HSUV for stage I–II and stage III–IV was statistically significant (p = 0.02). In a sensitivity analysis, only studies that explicitly used the EQ-5D-3L instrument were pooled (see Fig. 1 in ESM 1). In this sensitivity analysis, the pooled HSUVs were similar to those in the main analysis.

Fig. 2

Pooled results of studies reporting community- and choice-based health state utility values for lung cancer by stage. The size of the symbol representing the effect size in each study is relative to the weight it had in random-effects meta-analysis. Not all studies included both stage I–II and stage III–IV cases. Not all studies that did include all stages stratified by stage. The total number of individuals contributing to the pooled value for all stages was 5100; the total number was 1510 for stages I–II and 4703 for stages III–IV. The difference between the pooled values for stages I–II and III–IV was statistically significant (p = 0.02). Arabic numerals between square brackets next to author names refer to the reference list. CI confidence interval

Figures 3, 4 and 5 show the results of the sensitivity analysis of the 12 methodologically most comparable studies, which excluded non-EQ-5D studies and studies that did not apply the tariff matching the country of HRQoL respondents. All of these studies used TTO for valuation. For all stages, mean HSUVs ranged from 0.51 (95% CI 0.49–0.53) in Spain [50] to 0.78 in the USA (95% CI 0.77–0.79) [46] and Canada (95% CI 0.74–0.82) [42] (see Fig. 3). For stages I–II, results ranged from 0.78 (95% CI 0.74–0.82) for Canada [42] to 0.88 (95% CI 0.86–0.90) for Denmark [47] (see Fig. 4). For stage III–IV, the range was 0.61 (95% CI 0.59–0.63) for a study in the UK [36] to 0.85 (95% CI 0.83–0.87) in Germany [39] (see Fig. 5).

Fig. 3

Results of sensitivity analysis including only the methodologically most comparable studies reporting community- and choice-based health state utility values for all stages of lung cancer. Studies included in this sensitivity analysis used the EQ-5D instrument and applied the tariff matching the country of responding patients. Pooling results for this sensitivity analysis using a random-effects model was not possible because of the small number of studies within subgroups. The size of the symbol representing the effect size in each study is relative to the weight it would have in fixed-effects meta-analysis (i.e. relative to the inverse of its variance). Arabic numerals between square brackets next to the author names refer to the reference list. CI confidence interval, UK United Kingdom, US United States of America

Fig. 4

Results of sensitivity analysis including only the methodologically most comparable studies reporting community- and choice-based health state utility values for stage I–II lung cancer. Studies included in this sensitivity analysis used the EQ-5D instrument and applied the tariff matching the country of responding patients. Pooling results for this sensitivity analysis using a random-effects model was not possible because of the small number of studies within subgroups. The size of the symbol representing the effect size in each study is relative to the weight it would have in fixed-effects meta-analysis (i.e. relative to the inverse of its variance). Arabic numerals between square brackets next to author names refer to the reference list. CI confidence interval, US United States of America

Fig. 5

Results of sensitivity analysis including only the methodologically most comparable studies reporting community- and choice-based health state utility values for stage III–IV lung cancer. Studies included in this sensitivity analysis used the EQ-5D instrument and applied the tariff matching the country of responding patients. Pooling results for this sensitivity analysis using a random-effects model was not possible because of the small number of studies within subgroups. The size of the symbol representing the effect size in each study is relative to the weight it would have in fixed-effects meta-analysis (i.e. relative to the inverse of its variance). Arabic numerals between square brackets next to author names refer to the reference list. CI confidence interval, UK United Kingdom, US United States of America

Among the two studies reporting HSUVs for patients with metastatic non-small-cell lung cancer by time to death [57, 58], HSUVs decreased consistently throughout the last year of life (see Fig. 6). HSUVs ranged from 0.83 (95% CI 0.82–0.85) at ≥ 360 days from death to 0.56 (95% CI 0.46–0.66) at < 30 days from death. Both studies were based in the USA and used the EQ-5D instrument with TTO valuation.

Fig. 6

Results of studies reporting community- and choice-based health state utility values for lung cancer by time to death. Patients could contribute to multiple time-to-death categories. Therefore, an overall pooled result could not be provided. The size of the symbol representing the effect size in each study is relative to the weight it would have in fixed-effects meta-analysis (i.e. relative to the inverse of its variance). Arabic numerals between square brackets next to the author names refer to the reference list. CI confidence interval, TTD time to death, expressed in days

Results for the subgroup analysis by histology are shown in Fig. 2 in ESM 1. The included studies both used the EQ-5D instrument with TTO valuation [46, 53]. The HSUV for non-small-cell lung cancer was similar in the US-based study by Tramontano et al. [46] and the Canadian study by O’Kane et al. [53]. In the US-based study, the HSUV for non-small-cell lung cancer (0.78; 95% CI 0.77–0.79) was marginally higher than that for small-cell lung cancer (0.76; 95% CI 0.74–0.78). In the smaller Canadian study, the difference between the HSUV for non-small-cell lung cancer (0.77; 95% CI 0.76–0.79) and that for small-cell lung cancer (0.63; 95% CI 0.56–0.70) was more substantial.

As shown in Fig. 3 in ESM 1, HSUVs for men did not differ substantially across the four studies included in the subgroup analysis by sex [33, 40, 46, 53]. HSUVs for men ranged from 0.72 (95% CI 0.66–0.78) in the Australian study by Manser et al. [40], which applied the AQoL instrument with TTO valuation, to 0.78 (95% CI 0.77–0.79) in the US-based study by Tramontano et al. [46], which applied the EQ-5D instrument with TTO valuation. In three of these studies, the HSUV for men was similar to that for women, which ranged from 0.73 (95% CI 0.69–0.77) in the study by Grutters et al. [33], which applied the EQ-5D instrument to Dutch patients using the UK TTO valuation set, to 0.77 (95% CI 0.76–0.78) in the US-based study by Tramontano et al. [46]. However, the Australian study by Manser et al. [40] reported a substantially lower HSUV for women (0.52; 95% CI 0.44–0.60).

Results for the subgroup analysis by age are shown in Fig. 4 in the ESM 1. In both age groups, HSUVs were higher in the US-based study by Tramontano et al. [46], which applied the EQ-5D instrument with TTO valuation, compared with the Australian study by Manser et al. [40], which applied the AQoL instrument with TTO valuation. In both of the included studies, the HSUV for patients aged < 65 years was marginally lower than that for those aged > 65 years. For example, in the US-based study, the HSUV for those aged < 65 years was 0.76 (95% CI 0.75–0.77) compared with 0.80 (95% CI 0.79–0.81) for those aged > 65 years [46].

Figure 5 in the ESM 1 shows the results for the subgroup analysis by treatment modality. In the Dutch study by Grutters et al. [33], which used the EQ-5D instrument with the UK TTO valuation set, HSUVs ranged from 0.62 (95% CI 0.51–0.73) among those receiving only radiotherapy to 0.86 (95% CI 0.76–0.96) among those receiving surgery with radiotherapy. In the US-based study by Tramontano et al. [46], which also applied the EQ-5D instrument with TTO valuation, HSUVs ranged from 0.72 (95% CI 0.67–0.77) among those receiving surgery and radiotherapy to 0.81 (95% CI 0.80–0.82) among those receiving only surgery.

HSUVs by treatment line are shown in Fig. 6 in the ESM 1. Only one study was included in this subgroup analysis [38]. This study applied the EQ-5D instrument to a multinational selection of patients with metastatic non-small-cell lung cancer and applied the UK TTO tariff. The HSUV was 0.70 (95% CI 0.66–0.74) for the first treatment line, 0.73 (95% CI 0.67–0.78) for the second treatment line, and 0.57 (95% CI 0.47–0.66) for the third and fourth treatment lines.

Figure 7 in ESM 1 shows the results for the subgroup analysis of HSUVs by progression status [38, 49]. Both studies included patients with metastatic non-small-cell lung cancer and used the EQ-5D instrument. The multinational study by Chouaid et al. [38] applied the UK TTO tariff to all patients, whereas the Thai study by Limwattananon et al. [49] applied the matching Thai TTO tariff. In both studies, the HSUV for the ‘progression-free’ health state was similar: 0.70 (95% CI 0.66–0.74) in the study by Chouaid et al. [38] compared with 0.68 (95% CI 0.62–0.74) in the study by Limwattananon et al. [49]. In the study by Chouaid et al. [38], the HSUV for the ‘progressive disease’ health state (0.58; 95% CI 0.50–0.66) was substantially lower than the HSUV for the ‘progression-free’ health state (0.70; 95% CI 0.66–0.74). This was also the case for the study by Limwattananon et al. [49], although the 95% CI for the ‘progressive disease’ health state was wide.

Finally, Table 3 in ESM 1 shows the results for the two studies that included a control group of members of the general population [45, 56]. Both studies applied the EQ-5D instrument with TTO valuation. The difference in HSUV between lung cancer cases and controls (i.e. disutility) was 0.11 (95% CI 0.05–0.17) in Thailand [45] and 0.27 (95% CI 0.18–0.36) in the study applying the UK tariff to HRQoL data from US patients [56]. In both studies, the disutility due to lung cancer was statistically significant (p < 0.01).

4 Discussion

To our knowledge, we are the first to provide a systematic review and meta-analysis of community- and choice-based HSUVs across all stages of lung cancer. Our pooled results show that the mean HSUV across the literature for stage I–II lung cancer (0.78; 95% CI 0.70–0.86) is statistically significantly higher than the mean HSUV for stage III–IV lung cancer (0.69; 95% CI 0.65–0.73). This is plausible, as stage I–II lung cancer can often be treated with curative intent, whereas metastatic disease (stage III–IV) often requires ongoing palliative treatment with chemotherapy and/or radiotherapy [83]. The pooled HSUV for all stages (0.68; 95% CI 0.61–0.75) was close to that for stages III–IV, which is likely because lung cancer is most often diagnosed at stage IV [84].

While these pooled stage-specific HSUVs provide an overall mean HSUV across the literature, significant heterogeneity was present in all three stage groups, and this could not be explained by sampling error. In our sensitivity analysis that included only the methodologically most comparable studies, the most important study characteristics were the same (i.e. respondent type, stage of disease, elicitation method, instrument, valuation method, valuation population, and upper bound of the utility scale). Furthermore, these studies applied the tariff that matched the country of responding patients, which further reduced potential heterogeneity. Among these studies, stage-specific HSUVs strongly differed by country (and thus by tariff). Such studies were only identified for eight countries: Canada, China, Spain, the UK, the USA, Denmark, Germany, and Thailand. If stage-specific HSUVs provide sufficient granularity, authors of future economic evaluations of lung cancer interventions conducted in one of these eight countries may consider using HSUVs from the corresponding study identified in this sensitivity analysis. For example, a study seeking to investigate the cost effectiveness of lung cancer screening in the USA could use the stage-specific HSUVs from the study by Tramontano et al. [46]. However, for most countries, no such studies were identified. In addition, some authors may prioritise maximising the use of available data over selecting one methodologically optimal study. In both cases, our pooled analysis may provide the best available stage-specific HSUVs.

For some economic evaluations, stage-specific HSUVs may not provide sufficient granularity. For example, further stratification of HSUVs for metastatic lung cancer may be sought by treatment line or progression status. Subgroup analyses indicated that HSUVs for patients with metastatic non-small-cell lung cancer may indeed be lower among those with progressed disease and those undergoing a third or fourth line of treatment. Further exploratory subgroup analyses by histology, sex, age, and treatment modality did not provide unambiguous evidence for differences in HSUVs by these variables. For example, there were differences in HSUVs across treatment modalities within studies. However, the recommended and provided treatment modalities for lung cancer are mainly based on stage [85], which may partly explain these differences. In addition, results were inconsistent across studies. For example, receiving surgery with radiotherapy was associated with the lowest HSUV in one study but with the highest HSUV in another study. In general, few studies were available with the required level of granularity for each of the conducted subgroup analyses, reflecting the need for more high-quality research. The lack of clear evidence regarding the effect of histology, sex, age, and treatment modality on HSUVs provides additional support for our suggestion to use stage-specific (and, if available, country-specific) HSUVs where possible. Still, if authors of economic evaluations require HSUVs for other health states, Tables 2a–c in ESM 1 provide a comprehensive breakdown of patient characteristics, methodological characteristics, and the stratification variables used in each of the included studies. These tables may be used to identify specific studies meeting the needs of such analyses.

We only identified two relevant studies that included a matched control group. In these studies, the disutility due to lung cancer was 0.11 (95% CI 0.05–0.17) and 0.27 (95% CI 0.18–0.36), respectively. For comparison, the minimally important difference in EQ-5D HSUVs (defined as the smallest change that is perceived by patients as beneficial or that would result in a change in treatment) has been estimated at 0.06 for the USA and 0.08 for the UK [44, 86]. Future HSUV studies should more often include an adequately matched control group of members of the general population. Without such a control group, disutility is implicitly measured against full health (an HSUV of 1), so the disutility due to lung cancer could be overestimated, as members of the general public do not have perfect health [27, 56].
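The comparison with the minimally important difference (MID) can be spelled out with simple arithmetic. The sketch below checks, for each reported disutility, whether the point estimate and the lower bound of the 95% CI exceed each MID. The labels `study_1` and `study_2` and the function name are hypothetical; the numbers are those reported above. Applying both country-specific MIDs to both studies is an assumption made for illustration only, as the MIDs were estimated for the USA and the UK specifically.

```python
# Disutility estimates as (point, CI lower, CI upper), taken from the text above.
disutilities = {
    "study_1": (0.11, 0.05, 0.17),
    "study_2": (0.27, 0.18, 0.36),
}

# Minimally important differences in EQ-5D HSUVs, taken from the text above.
mids = {"USA": 0.06, "UK": 0.08}

def compare_with_mid(estimate, mid):
    """Report whether the point estimate and the CI lower bound exceed the MID."""
    point, lower, _upper = estimate
    return {"point_exceeds_mid": point > mid, "ci_lower_exceeds_mid": lower > mid}

results = {
    (study, country): compare_with_mid(estimate, mid)
    for study, estimate in disutilities.items()
    for country, mid in mids.items()
}

for (study, country), outcome in results.items():
    print(study, country, outcome)
```

Both point estimates exceed both MIDs, but only the larger disutility has a confidence interval that lies entirely above them.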

4.1 Strengths and Limitations

A major strength of our study is the inclusion of both small-cell and non-small-cell lung cancer, regardless of stage, whereas a previous review included only advanced non-small-cell lung cancer [10]. Our search strategy, which was constructed in collaboration with an information specialist, was also a major strength. We screened almost 6000 abstracts and over 450 full-text articles, identifying 51 peer-reviewed studies reporting original HSUVs. Through this search strategy, we identified a broader range of relevant studies than did two earlier reviews. The first, which was not a systematic review, screened 147 abstracts, yielding 22 studies [7]. The second screened 1832 abstracts, yielding 34 inclusions, of which 16 appeared to be non-peer-reviewed conference abstracts [10] (for some of these abstracts, we identified and included the full study). In addition, we included a thorough assessment of study characteristics, relevance, and validity, which allowed us to focus on comparable studies presenting the preferred community- and choice-based HSUVs. In contrast, the two previous reviews included studies regardless of quality and methodology, including expert opinions [7, 10].

The large number of identified studies and the assessment of study characteristics enabled us to select the methodologically most comparable community- and choice-based HSUV studies. Therefore, we could control for the most important factors that may affect HSUVs without relying on meta-regression, which can be prone to false-positive associations [87]. Nevertheless, heterogeneity remained present across the identified studies. These differences may be due to additional factors that we were not able to fully control for.

First, the time of measurement relative to diagnosis or treatment may influence HSUVs [25, 28]. Unfortunately, we could not account for this possible effect in our main analysis. Many of the studies included in our meta-analysis did not report the mean time between diagnosis and HSUV measurement. Also, while 4 of 27 studies measured HSUVs at multiple time points in the same patients, we could only include a single time point per study in our main analysis to avoid violating the assumption of independent observations. For those studies, we included the observation closest to baseline to limit the variability of time points across studies. Despite these limitations, the subgroup analysis by time to death showed that HSUVs for metastatic non-small-cell lung cancer tended to decrease during the last year of life. In particular, HSUVs had decreased by approximately one-third by the last month of life. A possible way to adjust for this effect in economic evaluations is to proportionally adjust the chosen HSUV for metastatic disease during the last phase of life.
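One possible form of such a proportional adjustment is sketched below. It multiplies a chosen baseline HSUV by a factor that equals 1 at 12 or more months before death and falls to two-thirds at death, reflecting the approximate one-third decrease described above. The linear shape of the decline, the 12-month window, the function names, and the baseline value of 0.69 used in the example are assumptions for illustration; the review does not prescribe a specific functional form.

```python
def end_of_life_factor(months_to_death, window=12.0, final_drop=1 / 3):
    """Multiplicative HSUV factor, assuming a linear decline over the last `window` months.

    The factor is 1.0 at `window` months or more before death and
    1 - final_drop at death (months_to_death = 0).
    """
    if months_to_death >= window:
        return 1.0
    months = max(0.0, months_to_death)
    return 1.0 - final_drop * (1.0 - months / window)

def adjusted_hsuv(base_hsuv, months_to_death, **kwargs):
    """Proportionally adjust a baseline HSUV for proximity to death."""
    return base_hsuv * end_of_life_factor(months_to_death, **kwargs)

# Illustrative baseline HSUV of 0.69 (assumed), evaluated at several time points.
for months in (24, 12, 6, 3, 0):
    print(months, round(adjusted_hsuv(0.69, months), 3))
```

In a state-transition model, such a factor could be applied per cycle to patients in their final year of life; a stepwise or non-linear decline may fit the underlying data better and would require only a change to `end_of_life_factor`.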

Second, it can be difficult to disentangle the effects of some variables, even when comparing methodologically similar studies. For example, one of the studies in our meta-analysis reported HSUVs for two UK-based trials [36]. Both trials measured HRQoL in patients with stage III–IV non-small-cell lung cancer using the EQ-5D instrument and used the UK TTO tariff for valuation. However, the mean HSUV was 0.61 (95% CI 0.59–0.63) in the first trial and 0.75 (95% CI 0.71–0.79) in the second trial. The mean age of participants was 77 years in the first trial and 62 years in the second trial. Also, participants in the first trial received erlotinib or placebo, whereas participants in the second trial received radiotherapy and chemotherapy. Therefore, both age and treatment may have driven these markedly different HSUVs. Unfortunately, reporting and stratification of HSUVs were inconsistent across studies in our meta-analysis, which limited our ability to disentangle such effects.

5 Conclusions

The presented evidence supports the use of stage-specific HSUVs for lung cancer. In addition, it supports the use of country-specific HSUVs. However, stage-specific HSUVs were not available for many countries. Therefore, our pooled HSUVs may provide the best available stage-specific HSUVs for most countries. For metastatic non-small-cell lung cancer, adjusting for the decreasing HSUVs in the last year of life may be considered. Based on a limited number of studies, further stratification of HSUVs for metastatic non-small-cell lung cancer by treatment line or progression status may also be considered. Little evidence exists to support the use of histology-, sex-, age-, or treatment modality-specific HSUVs. Still, if HSUVs for other health states are required, our comprehensive breakdown of study characteristics can help identify suitable studies.