Introduction

Thyroid cancer is the most common type of endocrine-related cancer [1]. There has been a continuous rise in the incidence of thyroid cancer over the last few decades in various countries worldwide [2]. In 2015, there were 3,528 new cases [3] of thyroid cancer in the UK, with incidence rates projected to rise by 74% between 2014 and 2035; this is equivalent to 11 cases per 100,000 people [4].

Thyroid cancers are divided into four main sub-types. Papillary (PTC) and follicular thyroid cancers (FTC) are commonly grouped as differentiated thyroid cancer (DTC) and account for approximately 94% [5] of thyroid cancers. Anaplastic thyroid cancer (ATC) is the most severe type of thyroid cancer and accounts for about 2% of thyroid cancers. Medullary thyroid cancer (MTC) accounts for the remaining patients diagnosed with thyroid cancer [6].

Treatment depends on the sub-type of thyroid cancer; however, initial therapy often involves a partial or total thyroidectomy. Radioactive iodine therapy and radiotherapy may also be offered to reduce the risks of recurrence [6]. Targeted cancer therapies and chemotherapies may be offered in cases of advanced cancer, or relapsing cancer that is symptomatic [7]. Symptoms associated with relapsing thyroid cancer include neck swelling, neck pain, trouble breathing or swallowing, hoarseness or other voice changes [8]. Many of these treatments are likely to have a substantial impact on health-related quality of life (HRQoL). Many people diagnosed with thyroid cancer are young and need lifetime monitoring and surveillance; there are therefore many negative impacts on HRQoL including anxiety associated with disease recurrence, disruptions to social life and changes to physical appearance [9].

Several of the health technology assessment (HTA) organisations in Europe require an economic evaluation to be submitted as part of the evidence base used to appraise health technologies, with effectiveness expressed in quality adjusted life years (QALYs). QALYs can be used to represent both the quality and quantity of life in each health state described in an economic evaluation, where a health state represents aspects of the condition treated by the health technology e.g. disease-free, progressed disease or post-surgery.

The HRQoL element of the QALY is measured in terms of utility values. HTA organisations, such as the National Institute for Health and Care Excellence (NICE) and the Scottish Medicines Consortium (SMC), stipulate that utility should be measured using the EuroQol 5-dimension questionnaire (EQ-5D) [10]. The EQ-5D is a generic preference-based HRQoL questionnaire. It is completed by the patient population of interest and the responses to the questionnaire are then weighted according to the preferences of the general population. Non-preference-based HRQoL questionnaires reflect the health of the respondents without any adjustments being made for the preferences of health states as valued by the general public or patients. Utility values are derived from responses to the EQ-5D by applying weights that reflect population preferences for a particular health state. Utility values range from 1 (equivalent to perfect health) to 0 (equivalent to death), with negative values reflecting health states worse than death. Utility values are used to estimate QALYs by weighting the time spent in a given health state (e.g. an individual with 10 years of life at a utility of 0.8 would accrue 8 QALYs).
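The QALY arithmetic in the example above can be sketched in a few lines of Python. This is a minimal illustration only: the function name and the list-of-pairs input format are assumptions made for the example, and no discounting is applied.

```python
def total_qalys(health_states):
    """Sum utility-weighted time over (utility, years) pairs.

    Hypothetical helper for illustration; discounting is deliberately omitted.
    """
    return sum(utility * years for utility, years in health_states)


# Example from the text: 10 years lived at a utility of 0.8.
print(total_qalys([(0.8, 10)]))
```

Passing several pairs, one per health state, gives the total across a sequence of states.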

NICE and the SMC prefer to consider utility estimates derived from the EQ-5D as it is a generic questionnaire and the estimates generated can be compared across diseases. The EQ-5D can be used to generate preference-based weighted utility estimates from the perspective of the general public; i.e. from the people who pay for the NHS, as the NHS is funded by UK taxpayers.

There are two versions of the EQ-5D questionnaire, one with three levels (EQ-5D-3L) and one with five levels (EQ-5D-5L). The EQ-5D questionnaire comprises five dimensions (mobility, self-care, usual activities, pain/discomfort and anxiety/depression). The EQ-5D-3L has three response levels for each of the dimensions: no problems, some problems and extreme problems; the EQ-5D-5L has five response levels: no problems, slight problems, moderate problems, severe problems, unable to/extreme problems. NICE and the SMC prefer utility estimates derived using the EQ-5D-3L UK value set; therefore, if the EQ-5D-5L is used in a study to collect HRQoL data, estimating utilities by converting the responses to the EQ-5D-3L UK value set is recommended.

In the absence of directly measured EQ-5D-3L values, economists can use utility estimates that have been generated by ‘mapping’ from other generic HRQoL questionnaires to EQ-5D-3L values. This method involves using a dataset containing both EQ-5D-3L responses and the alternative HRQoL questionnaire responses to generate a mathematical relationship between the responses to the two questionnaires. The regression equation generated can be used to convert the responses to the alternative HRQoL questionnaire into EQ-5D-3L utility estimates. Mapping algorithms are designed to be used with individual patient-level data to incorporate some of the uncertainty associated with estimating HRQoL within a cohort. However, mapping using mean values can offer an alternative in the absence of patient-level data [11]. Values from alternative methods of eliciting utility estimates, for example, using time trade-off (TTO) or standard gamble (SG) approaches, may also be used if EQ-5D-3L utility estimates are unavailable. TTO and SG are preference-based and the preferences (utilities) can be derived from patients for their own health or for scenarios, or from members of the general public for scenarios.
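The mapping idea described above (fit a regression on a dataset containing both questionnaires, then apply it to new scores) can be sketched as follows. This is a deliberately simplified, single-predictor illustration and not the published algorithms [16, 17], which use several questionnaire domains. The function names are hypothetical and the numbers in the usage example are invented purely to show the mechanics.

```python
from statistics import mean


def fit_simple_mapping(scores, utilities):
    """Ordinary least squares fit of: utility = intercept + slope * score.

    `scores` are responses on the alternative questionnaire and `utilities`
    are the paired EQ-5D-3L utilities from the same respondents.
    """
    x_bar, y_bar = mean(scores), mean(utilities)
    sxx = sum((x - x_bar) ** 2 for x in scores)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(scores, utilities))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def map_score(intercept, slope, score, max_utility=1.0):
    """Predict a utility from a score, capped at full health (1.0)."""
    return min(intercept + slope * score, max_utility)


# Invented paired data (NOT from any study), used only to run the sketch.
intercept, slope = fit_simple_mapping([0, 50, 100], [0.2, 0.6, 1.0])
print(map_score(intercept, slope, 75))
```

The same `map_score` call can be applied to each patient's score or, when only summary data are published, to a reported mean score, which is the distinction the paragraph above draws.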

To date, EQ-5D-3L utility values sourced directly from patients with thyroid cancer have been largely absent from economic evaluations of treatments for this population. Due to the increasing incidence of thyroid cancer, there has been a rise in the demand for targeted cancer therapies, which has led to a growing number of drugs receiving marketing authorisation for a thyroid cancer indication [12]. The International Society for Pharmacoeconomics and Outcomes Research (ISPOR) guidelines for good practice when selecting HRQoL values for use in economic modelling [11] highlight that systematic reviews are seldom conducted to inform utility values. This systematic review aims to summarise the utility values available to represent the HRQoL of patients with thyroid cancer, with the addition of utility estimates obtained by mapping from non-preference-based HRQoL questionnaires. The objectives of this systematic review are (a) to provide a catalogue of utility values that could be used to inform economic evaluations of thyroid cancer treatments and (b) to identify any potential health states for which published utility estimates are unavailable. A catalogue makes it easy to compare utility values across health states, and we include some of our observations as part of this review.

Methods

This systematic review follows the Centre for Reviews and Dissemination (CRD) guidance on conducting systematic reviews in healthcare [13].

Search strategy

The search strategy included thyroid cancer terminology and a HRQoL search filter [14] (Supplementary Appendix A) and was adapted according to the specifications of each of the databases utilised. Eight electronic databases were searched: MEDLINE (including MEDLINE In-Process and Other Non-Indexed Citations), EMBASE, Cochrane Database of Systematic Reviews (CDSR), Cochrane Database of Abstracts of Reviews of Effects (DARE), Cochrane HTA database, NHS Economic Evaluation Database (EED), Cost-effectiveness Analysis Registry, Patient-reported outcome and quality-of-life instruments database, as well as the EQ-5D-3L website. The date span of the searches was 1st January 1999 to 6th April 2019. The reference lists of relevant publications and websites were hand-searched to identify additional studies. The results of the searches were uploaded to an EndNote X7.4 library and de-duplicated.

The websites of NICE and the SMC were searched for guidance relating to treating people with thyroid cancer. The purpose of these searches was to identify any utility values included in UK policy documents that had not yet been published in peer-reviewed journals.

Study selection and inclusion criteria

All publications that described patients with thyroid cancer, of any sub-type, were included in the review if utility estimates were reported or if they could be calculated from the reported mean values. The NICE Reference Case [15] stipulates the use of the EQ-5D tool (3 or 5 level versions) as the preferred HRQoL questionnaire, using the EQ-5D-3L value set to weight the questionnaire responses. Therefore, we identified publications that either directly reported EQ-5D utility estimates or reported HRQoL values measured with a questionnaire that could be used to indirectly estimate utility values using published mapping algorithms (i.e. SF-36 and EORTC QLQ-C30). A list of the domains and response levels of the EQ-5D-3L, EQ-5D-5L, EORTC QLQ-C30 and SF-36 is presented in Supplementary Appendix B. Studies which elicited utility values using TTO or SG methods were also included.

The EQ-5D has a number of valuation sets from different countries as the preferences for the health states assessed by the EQ-5D vary according to nationality. The studies were not restricted by country of origin.

For a study that reported HRQoL values measured by the SF-36 or EORTC QLQ-C30 questionnaires to be included, either (a) the authors must have mapped the values using algorithms that converted the estimates into EQ-5D-3L utility estimates using the UK value set or (b) it must have been possible to do this ourselves from the data presented.

Study selection was performed by a single reviewer (RH). An inclusive strategy was adopted at the title and abstract screening stage meaning that publications that met the inclusion criteria (Table 1), as well as publications for which there was some uncertainty about the inclusion decision based on their title or abstract alone, were retrieved and judged based on their full text. Two reviewers independently assessed eligibility of the full-text papers retrieved. Any disagreements were resolved through discussion, and, if necessary, in consultation with a third reviewer.

Table 1 Inclusion criteria

Data extraction and synthesis

Descriptive and methodological characteristics of the studies were extracted. For example, data fields included: sample characteristic, sub-type of thyroid cancer, country of origin, interventions assessed, study design, some details of the patient groups for which HRQoL data were collected, sample size, method of HRQoL data collection (e.g. generic preference-based questionnaire, TTO), and the health states for which utility values were available or could be calculated.

The results were ‘mapped’ onto the utility scale using published mapping algorithms [16, 17] to convert scores from alternative HRQoL measurement tools to the EQ-5D-3L UK index. Where HRQoL domains required by the published mapping algorithms were not reported in a published study, that study was excluded from the review as the utility mapping could not be completed. Data extraction and the mapping of published values from HRQoL questionnaires to the EQ-5D-3L UK value set were performed by two reviewers (TL and RH) independently and the results were then cross-checked. Any discrepancies in the results were discussed by both reviewers and resolved.

A hierarchy of data extraction was developed for those studies reporting multiple HRQoL estimates. Only one set of utility values was extracted from each study. Directly reported utility values from the EQ-5D took precedence, followed by questionnaire data that could be mapped to the EQ-5D-3L using published algorithms [16, 17] and finally data obtained through the use of validated methodologies such as SG or TTO. If, for example, a study reported EQ-5D and SF-36 values, only the EQ-5D values were extracted.
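The extraction hierarchy described above can be expressed as a simple priority lookup. The sketch below is illustrative only: the dictionary, the function name and the string labels are assumptions made for the example. How ties within a tier were handled is not stated in the text; this sketch arbitrarily keeps the first source listed.

```python
# Lower rank = higher priority, following the hierarchy described above.
SOURCE_PRIORITY = {
    "EQ-5D": 0,          # directly reported utility values
    "SF-36": 1,          # mapped to the EQ-5D-3L
    "EORTC QLQ-C30": 1,  # mapped to the EQ-5D-3L
    "SG": 2,             # standard gamble
    "TTO": 2,            # time trade-off
}


def preferred_source(reported_sources):
    """Return the highest-priority source a study reports, or None."""
    ranked = [s for s in reported_sources if s in SOURCE_PRIORITY]
    if not ranked:
        return None
    # min() keeps the first item on ties (an arbitrary choice in this sketch).
    return min(ranked, key=SOURCE_PRIORITY.get)


# A study reporting both SF-36 and EQ-5D values: only EQ-5D is extracted.
print(preferred_source(["SF-36", "EQ-5D"]))
```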

Data from studies that reported only HRQoL changes from a baseline value were also extracted or mapped to the EQ-5D-3L to give a more complete picture of the evidence available. Change from baseline estimates of utility can be useful in economic modelling to populate a model in the absence of health state utility estimates and to validate model outcomes; however, the baseline characteristics of the study populations must be considered to ensure the change estimate is appropriate. Utility values reported for reference purposes only, such as the HRQoL of the general population, were not extracted from the included studies.

A narrative synthesis was used to summarise the information extracted from the included studies; utility values were grouped according to thyroid cancer sub-type and then by the health states that were most similar to each other. More formal methods of synthesis such as meta-analysis could not be used as multiple estimates for sufficiently similar populations were not available.

Results

The searches of the electronic databases identified 5327 citations (Fig. 1). Following the screening of titles and abstracts, 183 full-text publications were retrieved for a detailed assessment of eligibility. Consideration of the full-text papers led to the exclusion of 142 publications. The remaining 41 publications (33 studies) [1, 9, 18–57] formed the final set of evidence included in this systematic review.

Fig. 1 PRISMA flow diagram. DTC differentiated thyroid cancer, GTC general thyroid cancer, MTC medullary thyroid cancer, QoL quality of life

The evidence was reported according to the sub-type of thyroid cancer studied (for example, PTC and FTC were grouped as DTC). There were no publications that described patients specifically with ATC or MTC; however, the general thyroid cancer (GTC) group included patients with several sub-types of thyroid cancer or included patients for whom the sub-type of thyroid cancer was not reported. A summary of the characteristics of all of the included publications is presented in Table 2. The sample characteristic column provides details of the population in which HRQoL is measured.

Table 2 Characteristics of publications that were included in the HRQoL evidence review

In 21 of the included studies (24 publications) [18–21, 24–28, 30, 33–36, 38, 42, 44, 46–48, 50–52, 54], utility values for thyroid cancer came from the prospective collection of HRQoL data; some of these data were collected as part of a clinical trial [48, 50].

Six studies (seven publications) [35, 38, 44, 48, 50, 51, 53] measured HRQoL using the EQ-5D-3L. In 23 studies (29 publications) [9, 18–34, 39–43, 45, 52, 54–57], HRQoL was measured using either the SF-36 or the EORTC QLQ-C30 and therefore mapping to the EQ-5D-3L was required. Mapping was conducted from the SF-36 in 16 studies and from the EORTC QLQ-C30 in 7 studies. In one study [56], the mapping had been conducted by the study authors; we therefore conducted the mapping ourselves in the other 22 studies. In the remaining four studies (five publications), SG [37] and TTO [36, 46, 47, 49] techniques were used to estimate utility values. No studies using the Health Utilities Index (HUI) in this population were identified.

The pain domain of the EORTC QLQ-C30 was not reported in some studies, which meant these studies could not be included in the review; the mapping algorithm includes the pain domain and requires a complete set of domains to be used to estimate utility. Only three [19, 46, 47, 51] of the studies had an entirely UK-based population; this included only one [19] of the mapped studies. The regression equations are based on mapping to the EQ-5D-3L UK value set but, as not all of the HRQoL data were collected in the UK (27 studies with data collected completely outside of the UK), there are some potential inconsistencies in the results. If, for example, there is a tendency for quality-of-life estimates to be higher for a non-UK population, then any evidence used from these populations will generate higher utility estimates when mapped to the EQ-5D-3L UK value set.

Seven studies [39,40,41,42,43,44, 52] included patients with different sub-types of thyroid cancer; only one of these studies [43] provided utility estimates for each of the different sub-types.

Symptoms and comorbidities have a negative impact on HRQoL. Surgery has an initial impact on HRQoL that seems to resolve over time. A ceiling effect, in which patients report full health on the utility scale, was reported in one study [35], where 51% of the population reported full health on the EQ-5D. The utility values are collated in Table 3.

Table 3 Included evidence utility values

Discussion

The purpose of the review was to collate the utility values reported in the published literature and to highlight any populations for which utility estimates were not available. This review highlights the lack of utility values available for patients with ATC, MTC and the more severe form of DTC, radioiodine-refractory disease (RR-DTC). Evidence for patients with RR-DTC is limited to the data collected during the DECISION trial [58]. Evidence for ATC and MTC is also limited to one study [43].

Successful thyroidectomy surgery and radioiodine treatment can return patients to a HRQoL that approximates the average EQ-5D of the UK general population (0.856 mean for the UK population [59]) although, while receiving thyroid cancer treatment, HRQoL is negatively impacted [20, 21, 26, 27, 49].

The lowest estimate of utility for a health state (0.205) comes from patients with low-risk DTC [37] for bilateral recurrent laryngeal nerve injury as a complication of a thyroidectomy procedure. The utility estimates in this publication were generated using the SG method in a study of patients who had not all undergone the procedure nor experienced the complication.

Although ATC is the most severe sub-type of thyroid cancer, the only utility value for patients with ATC is higher than the values estimated for the other sub-types [43]. This utility estimate comes from a study [43] which compares HRQoL across the four most common thyroid cancer sub-types. Without alternative estimates for comparison, the impact of ATC on HRQoL remains uncertain.

The ‘ceiling effect’, in which a large proportion of people value their health at the highest value that can be recorded on the EQ-5D-3L using the UK value set, was observed in the study by Economopoulos [35]. It may also have occurred in other studies without being reported, as it is a common feature of HRQoL measurement with the EQ-5D-3L. Many patients with thyroid cancer do not experience significant symptoms and, after treatment, patients can be in long periods of remission [35]. This ‘ceiling effect’ may be problematic when trying to measure the impact of an intervention because, if many patients are already recording the highest utility value (of one), improvements to their HRQoL cannot be measured, regardless of the intervention. Nevertheless, the use of the EQ-5D enables consistency with decisions that have been made in the past and therefore provides a level playing field for any new assessments of utility [11].
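As a rough illustration of how a ceiling effect can be quantified from patient-level data, the sketch below computes the share of respondents whose utility is at the maximum of the scale. The function name and the input values are hypothetical and are not taken from any of the included studies.

```python
def ceiling_proportion(utilities, ceiling=1.0):
    """Proportion of respondents whose utility is at the scale maximum."""
    if not utilities:
        raise ValueError("at least one utility value is required")
    at_ceiling = sum(1 for u in utilities if u >= ceiling)
    return at_ceiling / len(utilities)


# Invented utilities for four respondents, two of whom report full health.
print(ceiling_proportion([1.0, 1.0, 0.8, 0.6]))
```

A high proportion indicates that little room is left on the scale to register an improvement in that sample.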

The Botella-Carretero [21] study produced results that seem anomalous when compared with the other utility values reported in this review, as they are much lower. This may be related to the withdrawal of the thyroid hormone; however, even the baseline HRQoL, measured before full hormone withdrawal, was substantially lower than any of the other utility estimates identified. Authors of the Schroeder et al. [42] study reported the poor HRQoL of patients being treated with levothyroxine or experiencing thyroid hormone withdrawal; this is in contrast to the studies by Borget et al. [44] and Tagay et al. [31] where a significant HRQoL decrement for these same health states was not demonstrated.

When conducting economic evaluations, judgements must be made about how well the available evidence represents the health states of interest [60]. These judgements are often based on two key questions: (a) does the health state from a published utility value encompass all aspects of the health state that is being considered in the economic evaluation (i.e. is the population in the published study sufficiently similar to the population being assessed), and (b) is the valuation methodology robust enough to produce a reliable estimate for that health state? The first question requires some input from topic specialists to allow the characteristics of the patients in each of the health states to be assessed; the second relies on an evaluation of the methods used to derive utility values.

In addition to the review, we used mapping algorithms to convert SF-36 and EORTC QLQ-C30 scores to the UK value set of the EQ-5D-3L. Using alternative algorithms would likely generate different results. However, we used algorithms considered to be the most appropriate for generating utility values for use in economic evaluations in the UK. The use of other mapping algorithms based on other HRQoL questionnaires would further broaden this catalogue. It should be noted that mapping algorithms have inherent flaws which make them less preferable for the estimation of utility values than direct measures of utility. In the absence of utility estimates for many of the health states experienced by patients with thyroid cancer, the mapped values presented in this study provide starting estimates.

The utility estimates contained within this review were not all derived from generic preference-based tools. However, when building an economic model or conducting an economic evaluation, it is necessary to attribute a utility value to each health state that is relevant to the population being studied. In the absence of an EQ-5D derived utility value for the population of interest, the use of a proxy value may be necessary. The applicability of the estimates available for each health state included in the evaluation needs to be assessed considering the population from which the estimate was obtained, the country of origin of the utility estimate, and the method used to obtain the estimate. Testing the importance of each of the parameters specified in an economic model using the results generated is important, and it may be preferable to use health state utility values which were obtained through less than perfect methods rather than omit health states completely due to a lack of values. Expert opinion can also be elicited to provide parameter estimates for use in any economic model or economic evaluation; however, the impact of proxy values on the overall results of the model should be evaluated using sensitivity analysis [61].
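A one-way sensitivity analysis on a proxy utility value can be sketched as below: the utility of a single health state is replaced by a low and a high value, and the total QALYs are recomputed each time. This is a minimal sketch under stated assumptions. The function name, the state names and all numbers are hypothetical, and a real model would also involve costs, discounting and transition probabilities.

```python
def one_way_sensitivity(base_states, state_name, low, high):
    """Recompute total QALYs with one state's utility set to low and high.

    `base_states` maps a state name to a (utility, years) pair.
    """
    def total(states):
        return sum(u * t for u, t in states.values())

    results = {"base": total(base_states)}
    _, years = base_states[state_name]
    for label, value in (("low", low), ("high", high)):
        varied = dict(base_states)          # copy so the input is unchanged
        varied[state_name] = (value, years)
        results[label] = total(varied)
    return results


# Invented inputs: the 'progressed' utility is treated as the proxy value.
states = {"disease-free": (0.8, 5), "progressed": (0.5, 2)}
print(one_way_sensitivity(states, "progressed", low=0.3, high=0.7))
```

If the totals barely move across the low and high values, the choice of proxy has little influence on the result; a wide spread suggests the health state warrants a better-sourced estimate.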

Authors of published studies with a similar aim to our review (i.e. collating utility values in a specific disease area [62–64]) have recommended a set of utility values to be used as a reference case for any economic models built in the same disease area. However, given the lack of utility estimates for all sub-types and the need to use proxy estimates for some health states, identifying a set of standard values would make economic evaluations more consistent but less flexible in terms of the health states that could be included.

The populations described in the included studies are not all based in the UK, and therefore the appropriateness of mapping HRQoL responses to the UK EQ-5D-3L value set may be questioned. However, including only the studies estimating utility values in a British population would greatly reduce the scope of the catalogue, as few health states would be available. Nevertheless, this review provides a catalogue of utility values for patients with thyroid cancer and at the same time offers details of the study methods used to elicit these values, allowing the end-user of the data to make an informed choice about which utility values to include in an economic evaluation. The information presented in the catalogue also enables health economists to quickly select individual publications of interest and to use alternative mapping algorithms, such as those for EQ-5D value sets other than the UK set, if so desired.

Formal tools to assess the robustness of utility values estimated from a variety of study designs and HRQoL tools are currently not available, largely because the most appropriate estimate is dependent on the nature of the question posed. The most robust source of utility values is likely to be from prospective studies obtaining patient-reported outcomes using generic preference-based questionnaires [35, 38, 44, 48, 50]. However, there are several studies within this review that deviate from this ideal [1, 9, 18–34, 36, 37, 39–43, 45–47, 49, 51–57].

Studies that included only data describing a change in HRQoL from baseline were included. When combined with other baseline HRQoL estimates, these data could be useful for economic evaluations if the population from the original study and the modelled health state being considered were sufficiently similar.

This study has some drawbacks. Some of the studies included had small sample sizes and some of the utility values presented are derived from mapping based on mean values. Utility values derived from HRQoL tools other than the EQ-5D-3L were mapped to the EQ-5D-3L UK value set in line with NICE’s reference case. However, it is plausible that disease-specific HRQoL tools may provide a better insight into the utility experienced by patients (e.g. for thyroid cancer patients the EORTC QLQ-C30 data could also be used to measure utility values based on the EORTC QLU-C10D). If available, the use of utilities derived from other HRQoL instruments may be more appropriate than expert opinion.

Conclusions

This review provides published utility values for estimating the HRQoL of patients with thyroid cancer and, in addition, utility estimates derived by mapping SF-36 or EORTC QLQ-C30 scores to the EQ-5D-3L UK value set. There are few utility estimates for ATC and MTC sub-types specifically; however, utility values are available for DTC and for a broader mixed population of people with thyroid cancer, which includes patients with different sub-types of thyroid cancer. Utility estimates are available for patients who have been treated with a wide range of thyroid interventions across different disease stages. The utility value estimates presented are, on the whole, consistent with each other and with what would be expected from a clinical point of view, with the most severe sub-types having the lowest utility estimates, although based on only a few estimates, and the most invasive interventions having the biggest impact on utility. The available estimates in the catalogue provide a useful resource for health economists as they undertake economic evaluations of interventions for thyroid cancer.