1 Introduction

Neuroendocrine neoplasms (NEN) occur throughout the body, the most common sites being pulmonary and digestive. NEN range from well-differentiated neuroendocrine tumours (NET) to poorly-differentiated carcinomas termed neuroendocrine carcinomas (NEC) [2]. They have varying growth characteristics: low grade and indolent to high grade and aggressive. NEN arising in the gastrointestinal tract are termed gastroenteropancreatic neuroendocrine tumours (GEP NETs).

The move from International Classification of Diseases (ICD) 10 to ICD-O-3 resulted in a more accurate classification of NEN. A further update to grade and stage classification in 2018 gave more reliability across users and systems [36]. As a result of these improvements the histological nomenclature and classification of NEN are starting to reflect more accurately the true incidence and prevalence of NEN and its subsets.

NEN incidence is rising [12, 19]. NEN incidence in England is 9.37 per 100,000 according to the UK’s National Cancer Registration and Analysis Service (NCRAS) [62]. The rising incidence of these tumours is as yet unexplained but may be real, or may relate to increased diagnosis [22] or improved classification systems. In 2017, NEN prevalence in England was 48 per 100,000 [62]. This is greater than most other upper gastrointestinal cancers (e.g. the incidence of gastric cancer stood at 29.6 per 100,000 in 2010) [6]. In the United States, annual age-adjusted incidence of NETs was 1.09 per 100,000 in 1973 and increased to 6.98 per 100,000 by 2012 [12]. Increasing incidence of NEN has also been observed in Spain [18], France [33] and Italy [5].

The heterogeneous clinical presentation and biology of NEN, often with vague abdominal symptoms, can present a challenge in diagnosis and management. Delayed diagnosis, which increases health care costs in a range of diseases [17, 24, 39], is likely to be a factor in NEN. Patients with NEN are often initially misdiagnosed or can experience a delay to diagnosis of up to five years [4]. High resource utilisation before diagnosis [53] and an expensive diagnostic process [23] contribute to increased overall costs in this phase. An advanced stage at diagnosis confers significantly poorer outcomes of NENs compared with non-NENs at the same anatomical site [19].

Treatment options comprise surgery, targeted therapies, peptide receptor radionuclide therapy (PRRT), trans-arterial chemoembolization (TACE), thermal ablation of liver lesions in the form of radiofrequency ablation (RFA) or microwave ablation (MWA), chemotherapy and liver transplant. The mainstay of treatment in GEP-NETs is long-term therapy with long-acting somatostatin analogues [60]. Treatment in NEN is also long-term, with surgery for advanced GEP-NETs unlikely to be curative [60]. There is a lack of evidence on the best way to investigate and diagnose a NEN in the face of innovative, costly investigation and treatment. Therapy for NEN is reported by patients to vary internationally, and there has been criticism about disparities in access to treatments across countries [35]. A multidisciplinary team approach is important for the optimal treatment of patients with GEP-NETs [30, 59].

The total cost of managing illness in NEN is an important figure for healthcare commissioners seeking an adequate and fair distribution of limited resources across healthcare systems. Some studies have estimated the cost using combinations of physician surveys [8] and registry linkage [23, 34] but results are not necessarily applicable to the UK due to differences in pricing of treatments, health systems and economies. Cost of care in cancer in the UK has been calculated [32]. Decisions about which costs fall within and outside the scope and perspective depend on the decision-maker and can have a significant impact on the resulting cost of illness [25]. In terms of cost-effectiveness, quality-adjusted life years (QALYs) are often the preferred measure of health benefit, and these require appropriate estimation of utility values (QALY weights). The measure of health-related quality of life in adults preferred by the National Institute for Health and Care Excellence (NICE) has been EQ-5D since 2008 [42].

Pharmaceutical pricing is the single largest factor impacting the cost of care in the form of long acting somatostatin analogues and targeted therapies [3, 23, 46]. A large registry-linked study found the pharmaceutical cost burden formed 42% of the yearly total cost of managing NEN [34]. In the maintenance phase of illness (more than a year after diagnosis), per patient costs in NEN were found to be triple of those in colon cancer due primarily to pharmaceutical costs [23].

There is a high cost of biochemical testing of blood and urine in NEN due to measurement of serum chromogranins and urine 5-Hydroxyindoleacetic acid. Radiological monitoring post-diagnosis and treatment for NEN is also costly (Table 1). Most patients will need computed tomography (CT) at diagnosis with many additionally needing magnetic resonance imaging (MRI) [16]. In stage 2–3 disease follow up CT can be six monthly for up to five years, but in stage 4 disease radiological monitoring can be for many years longer. Positron emission tomography (PET) scans are now commonly performed at diagnosis with many patients needing fluorodeoxyglucose (FDG-PET) as well as Dotatate-PET to establish “radiological grading” [57].

Table 1 Dotatate-PET scans are costed at £1800 in one London centre (Nuclear Medicine Department, King’s College Hospital, personal communication), with an average stage 4 disease patient requiring one or two scans

1.1 An overview of healthcare costs in England (Table 1) [45]

Somatostatin analogue costs range between £800 and £1000 per month (pm). Targeted therapies such as everolimus or sunitinib cost £2000–3000 pm [45]. PRRT costs £40,000–50,000 per patient, with many patients now having multiple cycles to control disease burden (King’s Health Partners Business Intelligence unit, personal communication). Chemotherapy costs are mostly due to high-cost drugs, with chemotherapy unit attendance contributing a much smaller part [45].

The most frequently used chemotherapy regimens are temozolomide with capecitabine as oral combination therapy or single-agent platinum-based intravenous treatment. In the treatment of grade 2 metastatic NEN, temozolomide is estimated to cost £1176 per 5-day cycle and capecitabine £120 per 14-day cycle [43]. For grade 3 NEN, a platinum-based intravenous regimen would cost £76.18 per cycle for carboplatin and £30.89 for etoposide [44], in addition to the attendance cost estimated at £130 per day on a chemotherapy unit [45].

Reimbursed costs for surgery and interventional radiology treatments in England are shown in Table 2 [45]. HRG = Healthcare Resource Group.

Table 2 National Health Service healthcare resource group (HRG) codes and tariff (in GB Pounds) covering the whole cost of each admission for the procedure

Health economic evaluations (HEEs) are increasingly adopted by the UK government to assess treatments [49]. HEEs have become an integral part of rational decision making in health care and form a key part of health technology assessments (HTAs). NICE requires an HEE for all new pharmaceuticals and will often commission HEEs where there is an absence of existing cost-effectiveness evidence. HEEs appear to have a major influence on NICE’s decision process; one research paper demonstrated that the economic calculation could predict 82% of NICE decisions [11].

In HEE the following questions are posed: firstly, whether a treatment is worth doing compared with other things we could do with the same resources, and secondly, whether these resources should be spent in this way and not on something else [14].

1.2 Types of health economic evaluation (Table 3)

Table 3 The ICER or incremental cost-effectiveness ratio is the amount of extra cost which will be incurred for a unit gain of health benefit
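The ICER definition in the Table 3 caption can be written as a short calculation: the difference in cost between two options divided by the difference in health benefit. The sketch below is illustrative only; the function name `icer` and the example figures are hypothetical and are not taken from any of the reviewed studies.

```python
def icer(cost_new, effect_new, cost_old, effect_old):
    """Extra cost incurred per extra unit of health benefit (e.g. per QALY)."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        # With no difference in benefit the ratio is undefined.
        raise ValueError("no difference in effect; ICER is undefined")
    return delta_cost / delta_effect


# Hypothetical inputs for illustration only.
example = icer(cost_new=30_000, effect_new=2.5, cost_old=10_000, effect_old=2.0)
print(example)
```

A negative ratio needs separate interpretation (the new option may be cheaper and better, or dearer and worse), so in practice the sign of each difference is usually inspected before the ratio is compared with a threshold.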

2 Aim

The aim of the paper was to systematically review and critically evaluate standalone English-language literature on health economics in NEN over the last decade. Previous reviews [9, 21] found limited numbers of health economic evaluations. Novel therapeutics also justify a focus on more recent evaluations.

3 Methods

A literature search was performed including EMBASE, Cochrane library, Database of Abstracts of Reviews of Effects (DARE), NHS Economic Evaluation Database and the Health Technology Assessment (HTA) Database. The last three were searched using the Centre for Reviews and Dissemination (CRD) Database. A Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram was generated (Fig. 1).

Fig. 1 PRISMA flow chart of search strategy

Articles were searched by citation and abstract. Papers published from 2010 to October 2019 in English were included. We looked at treatment search terms, taking into account terms used in previous reviews by Chau et al. [9] and Grande et al. [21] (Table 4). See Appendix 1 for full search strategy terms. Further literature, including grey literature, was identified through a Google Scholar search using the same terms. Citation searching was performed to ensure completeness.

Table 4 Search terms

The following were excluded: papers which were not economic evaluations or reviews of economic evaluations, papers which did not relate to NEN, papers not in English and budget impact studies. Cost of illness studies were excluded since limiting analysis strictly to HEEs was considered optimal.

Initial screening for excluding the above categories was conducted by a researcher. Subsequently, each paper was reviewed by three experts. The reviewers agreed on the final group of articles to be included in the review. If reviewers disagreed, a majority decision was taken.

Data on the methods and results of studies were abstracted into a standardised table by a researcher (BW) and checked by an experienced health economist (TS). Studies were critically appraised by TS using the CHEC list for trial-based economic evaluations and the Philips checklist for model-based economic evaluations [15, 47]. Results of data extraction were analysed by all authors.

4 Results

The EMBASE, Cochrane and CRD searches yielded 1388 articles. Google Scholar and citation tracking combined gave 145 results, making a total of 1533 articles. After removal of 22 duplicates, 1511 articles remained, of which 1470 were excluded. Of the 41 articles retrieved, 29 were excluded, leaving 12 in total: 6 papers and 6 abstracts.
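The article counts reported above can be checked with simple arithmetic. The sketch below uses only the figures stated in the text; the variable names, and the labelling of the two exclusion steps as screening and full-text review, are assumptions made for illustration.

```python
# Counts as reported in the Results text.
database_hits = 1388          # EMBASE, Cochrane and CRD
other_hits = 145              # Google Scholar and citation tracking
duplicates = 22
excluded_first_pass = 1470    # assumed to be exclusions at screening
excluded_second_pass = 29     # assumed to be exclusions after retrieval

identified = database_hits + other_hits
after_duplicates = identified - duplicates
retrieved = after_duplicates - excluded_first_pass
included = retrieved - excluded_second_pass

print(identified, after_duplicates, retrieved, included)
```

The final figure agrees with the 6 papers and 6 abstracts reported as included.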

Table 5a and b display the country of origin and funding source for the studies. Table 6 displays the study characteristics and Table 7 study results.

Table 5 Countries of origin and funding sources
Table 6 Study characteristics
Table 7 Study results

The Philips checklist [47] was applied to all economic evaluations (Appendix 2). All studies clearly stated their decision problem and most studies clearly reported the scope and perspective of their study, including the time horizon. Studies generally did not present evidence for the structures of their models, although most studies adopted similar structures (health states for pre-progression, post-progression and death), which likely arises as a result of the prominence of progression-free survival as a key endpoint in RCTs.

A number of studies excluded relevant comparators without justification (e.g., studies evaluating sunitinib but not everolimus). Most studies did not describe their methods for identifying data to inform model inputs and it was often unclear how relevant the utilities or QALY weights incorporated within models were. Exploration of uncertainty was sporadic, with parameter uncertainty most likely to be explored (through one-way sensitivity analyses and probabilistic sensitivity analysis), and most studies did not report internal and external validation.

Results from economic evaluations are grouped by comparison of treatments. Of twelve economic evaluations, eleven considered exclusively pharmacological treatment (three studies of SSAs, seven studies of sunitinib, everolimus and/or 177Lu-DOTATATE, one study of telotristat ethyl) and one compared surgery with intraarterial therapy. Seven studies of pharmacological treatment had placebo or best supportive care as the only comparator.

5 Somatostatin analogues (SSAs)

Marty et al. [37] developed a decision tree model to perform a cost-minimisation analysis of lanreotide (extended release formulation; trade name Somatuline Autogel® or Somatuline Depot®) versus octreotide (trade name Sandostatin LAR®). The measure of benefit was a successful injection (as there is a risk of clogging), and costs were estimated from French, German and UK healthcare payer perspectives and reported in 2010 Euros. The study found, through a combination of longer administration times and higher risk of clogging with octreotide, lanreotide to be cheaper per successful injection (France €34.90, Germany €91.10, UK €142.90). The study included only drug acquisition and administration costs, and did not include costs of adverse events or any measure of health benefit. The data source for administration time was a study in which nurses were timed performing injections into pads. The nurses were shown an instructional video on injection preparation and administration for lanreotide prior to administering lanreotide, whereas they were only provided with printed instructions for octreotide. Prior to this they were also given a demonstration and explanation of the features of the lanreotide pre-filled syringe and were asked to describe its most important characteristics.

Takemoto et al. [58] developed a three health state Markov model to assess the costs and consequences of octreotide versus best supportive care (BSC) in patients with metastatic midgut NET. The measure of benefit was progression-free survival. Costs were estimated from the private payer perspective and reported in Brazilian real (BRL). The primary source for effectiveness data was published data from the phase III PROMID trial [52]. Subjects remained on treatment until progression and resource use was estimated through published data and input from clinical experts. They state that octreotide is a clinically effective option to control tumour growth in patients with metastatic midgut NET. The authors state that since PROMID was not designed to evaluate OS they were unable to calculate life years gained.

Ray et al. [50] developed a decision tree model to assess the cost of treating unresectable, well-differentiated, advanced GEP-NET patients over a 6-month time horizon after they progress on octreotide. The stated basis is evidence of benefit to patients from switching to lanreotide after octreotide prior to modifying treatment class. There was no measure of health benefit (i.e., cost-minimisation analysis). Drug acquisition/administration, serious adverse event and patient management costs were considered. Patients could utilize octreotide escalation (30 mg every 3 weeks or 40 or 60 mg every 4 weeks), OCT plus PRRT, OCT plus liver-directed therapy, everolimus or lanreotide every 4 weeks. Costs were estimated from a US insurance payer perspective and reported in US dollars. Results were that lanreotide was found to be cost-saving versus alternatives when used post-octreotide. The authors state that clinical appropriateness must be considered when transitioning patients.

6 Everolimus, sunitinib and 177Lu-DOTATATE

Casciano et al. [7] developed a partitioned survival model to conduct a cost-utility analysis of sunitinib versus everolimus in patients with advanced, progressive pancreatic NET from a US payer perspective. The measure of benefit was QALYs and cost-effectiveness thresholds of $50,000 and $100,000/QALY were considered. Data from two RCTs (A6181111, RADIANT-3) were synthesised using the matching-adjusted indirect comparison method and using individual patient data from RADIANT-3 [51, 65]. For PFS this was an anchored comparison (with placebo control as a common comparator), but for OS this was an unanchored comparison, because significant treatment switching was observed after disease progression in the control arms, leading to confounding of OS estimates. The study estimated that everolimus would improve life expectancy and QALYs at an additional cost, with an ICER of $41,702/QALY. The primary threat to the validity of this study is the use of an unanchored indirect comparison of OS from two studies, which requires assumptions which are acknowledged to be generally very difficult to justify [48].

Chua et al. [10] developed a partitioned survival model to conduct a cost-utility analysis of everolimus (with BSC) versus BSC alone in patients with advanced or metastatic NET of GI or lung origin from a Canadian healthcare payer perspective (although described by the authors as a societal perspective). The measure of benefit was QALYs and a cost-effectiveness threshold of CA$150,000 was considered. Intention to treat data from RADIANT-4 [66] informed PFS and OS estimates up to month 26, however after this point a proportional hazards assumption for OS was imposed. Health state utility values were estimated by mapping from FACT-G measured in RADIANT-4 to EQ-5D values. The study estimated that everolimus would improve life expectancy and QALYs at an additional cost, with an ICER of CA$145,670/QALY. The ICER was sensitive to the hazard ratio for long-term OS and to the time horizon, suggesting that the economic value is derived in significant part from extrapolation beyond the trial evidence.

Mujica-Mota et al. [41] developed a partitioned survival model to conduct a cost-utility analysis of sunitinib, everolimus, and 177Lu-DOTATATE versus BSC in patients with advanced unresectable or metastatic NET. The measure of benefit was QALYs and costs were included from an NHS and personal social services perspective. Four RCTs [51, 56, 65, 66] were used to estimate PFS and OS; however, these RCTs had heterogeneous patient populations. Sunitinib could only be included when the population was limited to pancreatic NET, reflecting its licensed indication, and 177Lu-DOTATATE could only be included when the population was limited to gastrointestinal (midgut) NET. Ultimately three versions of the model were constructed: for pancreatic NET, data from RADIANT-3 [65] and A6181111 [51] were combined; for GI (midgut) NET, data from RADIANT-4 [66] and NETTER-1 [56] were combined; and for GI and lung NET, only data from RADIANT-4 were included. The study found that everolimus was unlikely to be cost-effective in any of the settings at its list price, sunitinib was likely to be cost-effective in pancreatic NET, and 177Lu-DOTATATE was unlikely to be cost-effective in GI (midgut) NET. The main limitation of the study is that indirect comparison was required for evidence synthesis because all RCTs were placebo-controlled rather than being head-to-head RCTs of active treatments.
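The partitioned survival structure used in the three studies above can be illustrated in a few lines. In this kind of model the proportion of patients in each health state is read directly from the PFS and OS curves: pre-progression is PFS, post-progression is OS minus PFS, and death is one minus OS. The sketch below is a generic illustration under simplifying assumptions (no discounting, no half-cycle correction); the function names, curves and utility values are hypothetical and do not reproduce any reviewed model.

```python
def partition_states(pfs, os_):
    """Split a cohort into three states from PFS and OS at matching time points."""
    states = []
    for p, s in zip(pfs, os_):
        if not 0.0 <= p <= s <= 1.0:
            raise ValueError("need 0 <= PFS <= OS <= 1 at every time point")
        states.append(
            {"pre_progression": p, "post_progression": s - p, "dead": 1.0 - s}
        )
    return states


def total_qalys(states, utility_pre, utility_post, cycle_years):
    """Sum utility-weighted state occupancy over cycles (undiscounted)."""
    return sum(
        (st["pre_progression"] * utility_pre
         + st["post_progression"] * utility_post) * cycle_years
        for st in states
    )


# Hypothetical survival proportions at three annual time points.
pfs_curve = [1.0, 0.6, 0.3]
os_curve = [1.0, 0.9, 0.7]
states = partition_states(pfs_curve, os_curve)
qalys = total_qalys(states, utility_pre=0.8, utility_post=0.6, cycle_years=1.0)
print(states, qalys)
```

Because state occupancy depends entirely on the two curves, any confounding or extrapolation of OS feeds straight into the QALY estimate, which is why the treatment of OS matters so much in these evaluations.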

Soares et al. [54], Walczak et al. [61], Johns et al. [26] and Kansal et al. [28] developed models to conduct cost-utility and cost-effectiveness analyses of sunitinib plus BSC versus placebo plus BSC in advanced unresectable or metastatic pancreatic NET. All four studies were supported by Pfizer. The measure of benefit was life years (LY) in Soares et al. [54] but was QALYs in the other study abstracts. The cost-effectiveness threshold was not stated in Soares et al., but anecdotally Portugal adopts a threshold of €30,000/QALY [67]. The cost-effectiveness threshold was stated by Walczak et al. [61] to be 99,543 PLN/QALY in Poland, and was stated by Johns et al. [26] to be £50,000/QALY in the UK for end-of-life treatments. Kansal et al. [28] did not state a cost-effectiveness threshold. A single RCT (A6181111) was used as the source of evidence in all studies. In Soares et al. [54] and Johns et al. [26] OS was adjusted for treatment switching using the rank-preserving structural failure time (RPSFT) method.

Soares et al. [54] estimated a LY gain of 1.83 with an ICER of €24,035/LY. When using ITT analysis (instead of RPSFT) the ICER increased to €34,387/LY. Walczak et al. [61] did not report costs and benefits separately but reported an ICER of 84,214 PLN/QALY (€20,441/QALY). Johns et al. [26] estimated an ICER of £22,587/QALY. Kansal et al. [28] estimated an ICER of €52,401/QALY.

7 Telotristat ethyl

Joish et al. [27] developed a Markov model to conduct a cost-utility analysis of telotristat ethyl added to octreotide (TE + SSA) versus octreotide alone (SSA) in patients with carcinoid syndrome diarrhoea (CSD). Costs were included from a third-party US payer perspective and the measure of benefit was QALYs. The cost-effectiveness threshold was $150,000/QALY but thresholds of $300,000 and $450,000/QALY were also argued to be relevant as CSD was argued to be an ultra-orphan condition. The model included states for adequate control and inadequate control, as well as death. Patients in the SSA arm could transition from adequate control to inadequate control but patients in the TE + SSA arm could not. The key source of effectiveness data was a multi-country RCT [29]. Utilities were estimated from a vignette study of ulcerative colitis patients. A biomarker, u5-HIAA, was used as a surrogate outcome which was assumed to mediate a mortality benefit. The study found that TE + SSA increased QALYs by 0.66 at an additional cost of $94,962 (ICER $142,545/QALY). This was considered cost-effective at the thresholds considered but these thresholds are higher than typically used in US economic evaluations ($50,000 and $100,000/QALY).

The most significant threats to the validity of the results in this case were the assumption that ulcerative colitis health state utility values elicited using vignettes are an appropriate proxy for CSD health states, and the surrogate assumption that the observed association between u5-HIAA and mortality is a suitable basis for estimating the effect on mortality of an intervention from its effect on u5-HIAA. It would instead seem more appropriate to take existing measures of health-related quality of life in this disease cohort, e.g., EORTC QLQ-C30 in the TELESTAR study [29], and use a utility mapping algorithm [64] to estimate preference-based utility values. This would narrow the gap in utility between adequate and inadequate control patients.

8 Surgery and intra-arterial therapy

Spolverato et al. [55] developed a Markov model to conduct a cost-utility analysis of hepatic resection (HR) versus intraarterial therapy (IAT) for patients with neuroendocrine liver metastases (NELM). Costs were included from a US health care provider’s perspective and the measure of benefit was QALYs. The cost-effectiveness threshold was $50,000 per QALY. The only health events incorporated in the model were retreatment and death. The key source of effectiveness data was a retrospective cohort study [38] of individuals undergoing treatment for NELM by HR or IAT over a 25-year period at one of nine US institutions. In the base case (57-year-old man with metachronous symptomatic NELM involving <25% of liver and no extrahepatic disease) HR was cost-effective versus IAT; however, other cases were identified where HR was not cost-effective, e.g., when asymptomatic but with hepatic involvement ≥25%. There was no accounting for confounding in the primary evidence source (since IAT recipients were more likely to have greater hepatic involvement and presence of extrahepatic metastases), so the economic evaluation likely overestimates the survival benefit from HR versus IAT and its cost-effectiveness.

9 Discussion

It was stated in the review by Chau et al. [9] that “Although the published literature in the area of NET is substantial, there is a lack of treatment-specific and comparative economic and outcomes research data associated with commonly used treatments”. In a subsequent review in 2019 Grande et al. [21] suggested “further economic evaluations are required to inform healthcare decision-making”. Our review demonstrates health economic literature in NEN which fulfils quality criteria for HEEs is still scarce.

Despite there being more HEEs in the literature than in the previous decade, there remains a paucity of economic evaluations, the majority of which are partly or wholly industry funded. The lack of cost data collected from relevant patient populations remains the main weakness of existing evidence from HEEs. Another limitation relates to the lack of data on medium- to long-term survival outcomes and therefore the need to rely on clinical expert opinion for predicting those outcomes beyond one or two years after the start of targeted therapies, particularly in gastrointestinal NEN [41].

The long-term treatment of NEN is costly, for the most part because of pharmaceuticals. In one cost analysis, the long-term follow-up of NEN was significantly more costly than that of colon cancer. The authors demonstrated that almost all of this increased cost was due to maintaining drug treatments such as SSAs [23]. Due to heterogeneity of treatment pathways in NEN it is difficult to calculate accurate continuing costs for the whole cohort. We would expect further literature to emerge on the cost-effectiveness of lutetium treatment.

A problem frequently observed in RCTs of anti-cancer therapies is that patients in the control arm are allowed to switch to the study drug following disease progression [31]. This leads to confounding in post-progression endpoints, for example overall survival, and renders estimates of these endpoints unsuitable for inclusion in economic models without some form of adjustment. The rank-preserving structural failure time (RPSFT) method was used in three included studies [26, 41, 54] to adjust for this confounding, but this method typically assumes that the treatment effect received by switchers must be the same as the treatment effect in those initially randomised to the study drug, which is unlikely to be true when the main cause for switching is disease progression. Alternative methods for adjusting for treatment switching (such as the inverse probability of censoring weights, IPCW, method), as well as intention-to-treat analyses should be conducted and presented. Soares et al. [54] have demonstrated that these methodological choices can have a very significant impact on cost-effectiveness estimates and therefore lead to decision uncertainty.
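The common treatment effect assumption described above can be made concrete. In the usual formulation of the RPSFT method, each patient's counterfactual untreated survival time is the time spent off treatment plus the time spent on treatment scaled by a single factor exp(psi), where a negative psi indicates a beneficial treatment. The sketch below shows only this back-transformation; estimating psi itself is not shown, and the function name and patient figures are hypothetical.

```python
import math


def counterfactual_untreated_time(time_off, time_on, psi):
    """Survival time expected had the patient never been treated.

    The same psi is applied to every patient, whether treated from
    randomisation or only after switching: this is the common
    treatment effect assumption.
    """
    return time_off + math.exp(psi) * time_on


# Hypothetical control-arm patient who switched at progression:
# 10 months untreated, then 8 months on the study drug.
psi = math.log(0.5)  # assumed value: time on treatment counts for half
adjusted = counterfactual_untreated_time(time_off=10.0, time_on=8.0, psi=psi)
print(adjusted)
```

If switchers in fact benefit less than patients treated from randomisation, applying the same psi removes too much survival time from the control arm and so overstates the treatment effect.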

Another problem encountered when comparing multiple novel therapies is that there are frequently no head-to-head comparisons of the treatments in randomised controlled trials. Where trials have shared a common comparator (often placebo) indirect comparisons and multiple treatment comparisons [13] can be appropriate under certain assumptions. One of the assumptions is that treatment effect modifiers (patient characteristics which affect the relative effectiveness of one or more treatments, as opposed to simply being prognostic) are distributed equally across studies. This assumption can be relaxed by using population-adjusted indirect comparisons when there is access to individual patient data (IPD) in one of the studies in the indirect comparison [48]. One such approach (matching adjusted indirect comparison, MAIC) weights patients in the trials where IPD are available to match the patient characteristics in another trial before conducting an indirect comparison.
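The weighting step in MAIC can be sketched for a single covariate. Weights of the form exp(alpha × (x − target mean)) are chosen so that the weighted mean of the covariate in the IPD trial equals the mean reported for the other trial. The example below is a minimal, pure-Python illustration using Newton's method; the function names and the age data are hypothetical, and a real analysis would match several characteristics at once.

```python
import math


def maic_weights(ipd_values, target_mean, iterations=50):
    """Weights that make the weighted IPD mean equal the target mean."""
    centred = [x - target_mean for x in ipd_values]
    alpha = 0.0
    for _ in range(iterations):
        w = [math.exp(alpha * c) for c in centred]
        # Gradient and curvature of the convex objective sum(exp(alpha * c)).
        gradient = sum(c * wi for c, wi in zip(centred, w))
        curvature = sum(c * c * wi for c, wi in zip(centred, w))
        alpha -= gradient / curvature
    return [math.exp(alpha * c) for c in centred]


def effective_sample_size(weights):
    """Approximate number of patients' worth of information after weighting."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Hypothetical ages from the trial with IPD; the other trial reports mean age 58.
ages = [50.0, 55.0, 60.0, 65.0, 70.0]
weights = maic_weights(ages, target_mean=58.0)
weighted_mean = sum(w * a for w, a in zip(weights, ages)) / sum(weights)
ess = effective_sample_size(weights)
print(weighted_mean, ess)
```

The effective sample size falls as the populations differ more, which gives a simple indication of how much the adjusted comparison relies on a small subset of patients.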

If the aim is to compare across conditions, health-related quality of life in a clinical setting can be measured using a generic preference-based multi-attribute utility instrument, such as the EQ-5D, and should be valued using an appropriate tariff (e.g., from a time trade-off or standard gamble study). However disease-specific instruments are often more sensitive and responsive for certain health states [63]. Areas where inaccuracy was introduced in the economic evaluations reviewed included making certain assumptions without justification [55], using other disease areas as proxies [55], use of vignette studies which value hypothetical disease states [27] and referencing unpublished data [27].

Little is known about preference based values associated with health related quality of life outcomes of targeted therapies after disease progression. Almost all evidence originates from randomised clinical trials which measure these outcomes at fixed time points driven by dosing schedules, which vary across treatments, typically with high drop-out rates. This suggests longitudinal preference based assessment of quality of life outcomes of NEN, based on representative patient cohorts, is needed. This evidence would help distinguish patient preferences and relative effectiveness between targeted therapies in advanced or metastatic pancreatic NETs which have similar clinical outcomes but different safety profiles [41].

All the included studies were model-based economic evaluations rather than being economic evaluations alongside clinical trials (EEACT). EEACT include direct measurements of healthcare resource use, for example through resource use questionnaires, claims and other databases and also measurements of effectiveness, e.g. preference-based health-related quality of life. Model-based economic evaluations can accurately estimate costs in the delivery setting (although this is dependent on having high quality data sources), but are less likely to accurately estimate any knock-on effects on costs [20]. In a single centre study the data collection for an EEACT is typically not prohibitive, but for multi-centre (and particularly multinational) trials, data may be more challenging to collect.

Where studies do not incorporate quality of life (i.e., by using LYs instead of QALYs), decision-makers should be aware that incremental QALYs can be considerably lower than incremental LYs in economic evaluations of cancer treatments unless substantial improvements to quality of life are realised. This means ICERs with QALYs as denominators can be much greater than ICERs with LYs as denominator. Soares et al. [54] estimated an ICER of €24,000/LY, however the best estimate of a cost-effectiveness threshold for Portugal is €30,000/QALY. Assuming a fairly typical utility value of 0.7 this would lead to an estimated ICER of around €34,000/QALY.
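The conversion described above is a single division: if every life year gained is lived at a utility of 0.7, each life year contributes 0.7 QALYs, so the cost per QALY is the cost per life year divided by 0.7. The sketch below uses the rounded figures from the text; the assumption that utility is constant over all life years gained is a simplification, and the variable names are illustrative.

```python
icer_per_life_year = 24_000   # euros per life year, rounded figure from the text
assumed_utility = 0.7         # assumed constant utility for life years gained
threshold_per_qaly = 30_000   # threshold for Portugal quoted in the text

# Each life year gained yields `assumed_utility` QALYs under this assumption.
icer_per_qaly = icer_per_life_year / assumed_utility

print(round(icer_per_qaly), icer_per_qaly > threshold_per_qaly)
```

On these assumptions an ICER that sits below the threshold when expressed per life year moves above it when expressed per QALY.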

The limitations of the study mainly relate to the limitations of our systematic review and also to the available evidence. The search terms may not have captured all available literature; to attempt to remedy this, citation searching and ‘grey’ literature searches using Google Scholar were performed. We did not look for articles outside of the English language, which meant that at least one relevant economic evaluation in NEN that appeared in a previous review was not included [40]. Although EMBASE and other related databases were included in the search, we did not have access to the DIMDI Superbase, which was used in the Chau et al. [9] review. We did not perform a formal risk of bias quality assessment, rank evidence according to grade or examine publication bias.

Limited data on costs hampers adoption decisions regarding targeted treatments. A salient example is the uncertainty in the value for money of PRRT in progressive or metastatic NET, for which there is strong evidence of clinical benefits in terms of progression free survival, but no data on impact on healthcare resource utilisation and costs [41]. Further research could aim to enrich the evidence base on resource use and costs of NEN using electronic medical records and registry data. Studies using models such as that of Laudicella et al. [32] may generate the required generalizable evidence to inform timely policy decision making.

We recommend that investigators for future trials in NEN should ensure that key endpoints for economic evaluations (progression-free survival and overall survival) are not confounded by crossover (treatment switching), or that if crossover is allowed then studies should make this clear beforehand, and collect all necessary data to support methods for adjusting for crossover with different underlying assumptions (e.g., IPCW and RPSFT or iterative parameter estimation methods). The results of each of these different methods should be presented alongside intention-to-treat analyses. Investigators should release anonymised patient-level data to support more reliable syntheses of studies and ideally collect preference-based health-related quality of life measures (e.g., EQ-5D) on a schedule which minimises bias (for example, measurements should not be taken only prior to drug administration). We also suggest investigators conducting economic evaluations should release their modelling as open data to maximise transparency and further research in the field.

10 Conclusion

Overall we conclude that although there has been progress since 2013 [9], there are still only a small number of high quality independent economic evaluation studies in NEN. Most HEEs do not meet published health economic criteria used to assess quality. Clinicians should be cautious when interpreting economic evaluations of high-cost treatments of NEN given the complexities associated with comparisons across heterogeneous trials with confounding of relevant outcomes. Further research with high-quality effectiveness data and rigorous applied health economic analysis is needed.