Literature Search and Expert Interview Results
A total of 549 records were identified from the searches, with an additional seven identified by the authors (Fig. 1). To keep the number of records within our practical limit of 300, articles published before 2011 were not considered for screening; we assumed that any methodological issues identified in earlier literature would have been either resolved or reiterated in later articles. Screening of titles, abstracts and, where necessary, full texts left 31 papers for inclusion.
A total of 13 (65%) experts consented to be interviewed. Four represented the scientific affairs, technology appraisal, clinical guidelines and diagnostics assessment programmes at NICE, and expressed their personal views, rather than NICE policy. Other interviewees included four senior health economists, two researchers in digital health, a representative of the Medical Research Council (MRC), a specialist from the Precision Medicine Catapult institute, and professors of health informatics and primary care sciences. The interviews had an average length of approximately 50 min.
Defining Precision Medicine
A preliminary consideration for this study was to define the types of technologies and services that precision medicine encompasses. Ten papers from the review provided a definition of precision medicine, as did each of the consulted experts, resulting in a wide range of interpretations [10,11,12]. Most agreed that precision medicine encompasses more than pharmacogenetic and pharmacogenomic tests, and that the term is now used interchangeably with stratified medicine. It is also replacing the term personalised medicine, since it covers technologies that offer unique treatment pathways for individual patients [10, 11].
For the purposes of this study, we consider that a tool falls under the precision medicine ‘umbrella’ if it can be used to stratify patients to a specific treatment pathway or therapy, based on specific characteristics of the individual. These characteristics vary by tool but go beyond demographic or socioeconomic factors, and include genomic (or other ‘omic’) information, behavioural traits (including preferences), and environmental and physiological characteristics. Furthermore, tools will usually provide information on disease risk, diagnosis, prognosis or treatment response. This definition is summarised in Fig. 2.
Interviewees stressed an additional and important distinction between prognostic and predictive tests. Prognostic tests indicate the likelihood that an individual patient will have a particular disease course or natural history; an example is the Decipher® (GenomeDx Biosciences Laboratory, San Diego, CA, USA) prostate cancer test, which calculates the probability of metastasis. Predictive tests provide an estimate of the expected disease response to specific treatments, such as tests identifying the human epidermal growth factor receptor 2 (HER2) gene to determine treatment allocation for patients with breast cancer. This distinction has direct implications for HTA: a senior health economist highlighted a recent instance in NICE Diagnostics Guidance in which the committee’s discussions focused on whether the technology could be considered predictive as well as prognostic, since this had an impact on the cost effectiveness of the test.
Three major types of precision medicine technology likely to emerge over the next decade were identified: complex algorithms, digital health applications (‘health apps’) and ‘omics’-based tests. These are summarised in the following sections and, alongside existing precision medicine tools, in Table 1.
The experts anticipated increased use of algorithms that use artificial intelligence (AI) to aid clinical decision-making over the next decade. These algorithms require large datasets (‘knowledge bases’) containing a large number of variables, such as genetic information, sociodemographic characteristics and electronic health records. Using this information, the algorithms provide clinicians and patients with patient-specific predictions of expected prognosis and optimal treatment choices. Algorithms update regularly as new information is added to the knowledge base, an approach termed ‘evolutionary testing’. The first approaches of this type for clinical use are already being established [16,17,18,19,20]. AI-based technologies will also be combined with advances in imaging to develop algorithms that incorporate scan results into knowledge bases and offer more accurate information.
Health apps include a wide range of tools that provide disease management advice, receive and process patient-inputted data, and record physical activity and physiological data such as heart rate. A subset of apps will likely fall under precision medicine, with the most advanced also utilising AI-based technology as described in Sect. 3.3.1. Numbers of health apps are expected to increase significantly over the next decade. Digital health experts predicted that principal developments in this area would involve apps that analyse social or lifestyle determinants of health, such as socioeconomic status or physical activity, in order to stratify patients, including apps linked to activity monitoring devices (or wearable technologies). In November 2017, NICE published briefings on mobile technology health apps, developed by the NICE medical technologies evaluation programme as a proof-of-concept activity and known as ‘Health App Briefings’. One of the first to be published concerned Sleepio, an app shown in placebo-controlled clinical trials to improve sleep through a virtual course of cognitive behavioural therapy.
Many current precision medicine tools use genetic and genomic information to estimate disease prognosis and predict treatment response. A senior health economist predicted that the use of other ‘omics’-based biomarkers, such as proteomics, metabolomics and lipidomics, would become more common and partially replace genomics over the next decade.
‘Omics’-based testing is expected to increase in complexity and scope, with single tests informing treatment pathway, therapy choice or disease risk for multiple diseases simultaneously. This was described by one expert as “multi-parametric testing”. Whole-genome sequencing is at the broadest end of this scale and could feasibly provide information on risks and treatment decisions for hundreds of diseases.
Issues for Health Technology Assessment
Precision medicine interventions will pose challenges at each stage of the HTA process, from scoping through to review (Fig. 3).
The nature of the decision problem presented to HTA agencies and guideline developers will become more difficult to define when dealing with some precision medicine technologies and services. The emergence of multi-parametric tests, for instance, is expected to increase the number of relevant interventions, comparators and populations encompassed by a single assessment by providing information on multiple diseases simultaneously. The number of care pathways under consideration will also increase because tests may (i) not have a defined place in the care pathway and could potentially be used at a range of timepoints; and (ii) be used in combination with other tests [33,34,35,36,37]. Evaluating all of the relevant pathways, populations and comparators could be practically and computationally infeasible, and will likely necessitate increased use of expert opinion [11, 33,34,35, 38, 39]. One expert noted that these issues are particularly relevant for whole-genome sequencing, which can be performed at any point during an individual’s lifetime, inform care pathways for a wide range of diseases, and be analysed using many different methods.
The fast pace of innovation in precision medicine may also mean that assessment bodies face higher volumes of evaluations. Mixed views on how to address this emerged from expert interviews. A NICE analyst stated that scoping workshops, in which clinicians and other consultees determine which technologies should be evaluated, may be sufficient for technology appraisal. With respect to health apps, researchers agreed that new systems would need to be put in place to manage the burden of assessment. This could involve (i) a preliminary self-assessment phase; (ii) appraising classes of (rather than individual) apps; or (iii) setting priority areas using clinician input. Each presents its own difficulties: classes would need to contain apps that are relatively homogeneous, whilst any priority-setting process would require a clear and transparent decision-making framework.
Experts highlighted that adaptive AI-based algorithms would present a unique challenge in terms of regulation and evaluation. As more data are processed and the algorithm becomes more effective over time, evaluators would need to decide how frequently and exactly when to assess safety and clinical and cost effectiveness. Interviewees also highlighted that technical validation of complex algorithms could be a challenge.
A number of studies stated that the value placed on knowing diagnostic test results may need to be included in economic evaluations of precision medicine [35, 39, 42,43,44,45,46,47,48,49,50]. This could be positive if such knowledge benefits patients and their families: directly in the case of hereditary conditions, or indirectly through enhanced autonomy or changes in lifestyle and screening behaviours. Conversely, unintentional harms may also occur, for example due to psychological stress for patients and families.
Experts highlighted that the health-related quality of life instruments typically used in economic evaluations are unlikely to capture this value of knowing and that decision-makers may instead consider these factors through deliberation, taking into account the patient perspective, when making recommendations. Three studies [12, 35, 44] suggested that discrete choice experiments could be used to value patient preferences for increased knowledge, over and above any specific quality-adjusted life-year (QALY) gains deriving from subsequent treatment decisions. Quantifying these benefits separately (or in monetary terms) would be consistent with a welfarist framework but not the extra-welfarist one adopted by some agencies such as NICE. Furthermore, incorporating these additional aspects of value on the benefits side of the cost-effectiveness equation also requires that they be incorporated when accounting for opportunity costs. Incorporating non-health benefits into the evaluative framework of HTA would therefore require knowledge of (i) the extent to which society is willing to trade off health and non-health benefits and (ii) what type of services might be displaced in order to fund a new intervention, and their associated non-health benefits.
Evidence Evaluation and Synthesis
Precision medicine presents numerous challenges for evidence evaluation. The stratification of patients to increasingly small subgroups will reduce sample sizes [10, 44, 53] and result in only certain subgroups (i.e. ones with specific biomarkers) being included in individual trials. Obtaining head-to-head estimates of comparative effectiveness for treatments and subgroups will become more difficult and will result in evidence networks being incomplete in cases where no common comparator links together the available trials. One study and several of the interviewees concluded that expert opinion will be needed more regularly to fill gaps in the evidence, along with suitably robust methods for eliciting these judgements. Interviewees also noted that new trial designs are being developed that may be more compatible with precision medicine, including basket, umbrella and adaptive trials [54,55,56]. These designs, which are yet to contribute to any value dossiers submitted to HTA agencies, allow for trials to be adapted in terms of inclusion criteria and treatment response.
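The connectivity problem described above can be checked mechanically: an indirect comparison is only feasible when every pair of treatments is linked through a chain of head-to-head trials. A minimal sketch of such a check (the trials and treatment names are purely hypothetical, not drawn from the review):

```python
from collections import defaultdict, deque

# Hypothetical head-to-head trials, each comparing two treatments.
trials = [
    ("standard care", "drug A"),
    ("drug A", "drug B"),
    ("drug C", "drug D"),   # no link back to the rest of the network
]

def connected_components(edges):
    """Group treatments into connected components of the evidence network."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, components = set(), []
    for node in graph:
        if node in seen:
            continue
        queue, comp = deque([node]), set()
        while queue:
            t = queue.popleft()
            if t in comp:
                continue
            comp.add(t)
            queue.extend(graph[t] - comp)
        seen |= comp
        components.append(comp)
    return components

comps = connected_components(trials)
if len(comps) > 1:
    print(f"Disconnected network ({len(comps)} components): "
          "indirect comparison across components is not possible "
          "without additional evidence or expert judgement.")
```

In this toy network the stratified comparators drug C and drug D form an island, so no estimate linking them to standard care can be produced by network meta-analysis alone.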
Nevertheless, the need to analyse multiple subgroups and more complex treatment pathways in decision models for precision medicine interventions is likely to necessitate additional sources of evidence in terms of both cost and clinical data [33,34,35,36, 38, 39, 53]. An absence of relevant data recently resulted in the discontinuation of a diagnostic service delivery guideline being developed by NICE. Regulatory efforts are being made to encourage the generation of clinical evidence, including the introduction of the In Vitro Diagnostic Medical Devices Regulation (IVDR) by the European Commission in 2017. However, as the new clinical evidence requirements of the IVDR will not apply until 2022, evidence paucity is likely to remain an issue in Europe in the medium term.
There was consensus that use of observational data for assessing precision medicine interventions will increase over the next decade [11, 33, 36, 39, 44], including registry data, cohort studies and electronic health records [16, 59, 60]. Experts noted that advanced statistical methods (and accompanying technical guidance) would be required to identify causality while controlling for the risks of selection bias and confounding in observational data.
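One such method is inverse probability of treatment weighting, which reweights observed patients so that treated and untreated groups are balanced on measured confounders. A minimal sketch on simulated data, with a single binary confounder driving confounding by indication (all parameter values are illustrative assumptions, not estimates from the literature):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

# Hypothetical observational dataset with a binary confounder (severity).
severe = rng.random(n) < 0.4
# Confounding by indication: severe patients are more likely to be treated.
treated = rng.random(n) < np.where(severe, 0.8, 0.2)
# True treatment effect is +2.0; severity independently worsens the outcome.
outcome = 2.0 * treated - 3.0 * severe + rng.normal(0, 1, n)

# Naive comparison is biased downwards because treated patients are sicker.
naive = outcome[treated].mean() - outcome[~treated].mean()

# IPTW: weight each patient by 1 / P(received their treatment | severity),
# using the empirical propensity score within each severity stratum.
p_treat = np.where(severe, treated[severe].mean(), treated[~severe].mean())
weights = np.where(treated, 1 / p_treat, 1 / (1 - p_treat))
iptw = (np.average(outcome[treated], weights=weights[treated])
        - np.average(outcome[~treated], weights=weights[~treated]))
print(f"naive estimate: {naive:.2f}, IPTW estimate: {iptw:.2f}")
```

The naive difference badly understates the assumed true effect of +2.0, while the weighted estimate recovers it; in practice the propensity score would be modelled on many covariates rather than read off one stratum.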
Multiple studies predicted that the complexity of clinical pathways in precision medicine could render traditional Markov-type model structures insufficient for capturing long-term costs and benefits [10, 12, 35, 42, 43, 61]. For example, multi-parametric testing may lead to secondary findings unrelated to the original test, as well as spill-over effects on family members and future generations. A number of studies concluded that more research is needed to establish best practice guidelines for modelling precision medicines [12, 33, 43, 60], while others suggested approaches that could handle complex structures more adequately, such as microsimulation and discrete event simulations [12, 35, 62].
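To illustrate why individual-level approaches are attractive, the sketch below passes patients one at a time through a simple test-then-treat pathway; because each patient carries their own history, conditioning later events on earlier ones is straightforward, which cohort-level Markov models handle awkwardly. All parameter values are illustrative assumptions:

```python
import random

random.seed(1)

# Hypothetical parameters (all illustrative assumptions).
SENS, SPEC = 0.9, 0.8          # test sensitivity and specificity
P_DISEASE = 0.3                # disease prevalence
QALY_TREATED, QALY_UNTREATED = 8.0, 5.0
QALY_HEALTHY = 10.0
COST_TEST, COST_TREAT = 500, 20_000

def simulate_patient():
    """Send one patient through a simple test-then-treat pathway."""
    diseased = random.random() < P_DISEASE
    positive = random.random() < (SENS if diseased else 1 - SPEC)
    cost = COST_TEST + (COST_TREAT if positive else 0)
    if diseased:
        qalys = QALY_TREATED if positive else QALY_UNTREATED
    else:
        qalys = QALY_HEALTHY  # false positives incur treatment cost only
    return cost, qalys

n = 50_000
results = [simulate_patient() for _ in range(n)]
mean_cost = sum(c for c, _ in results) / n
mean_qalys = sum(q for _, q in results) / n
print(f"mean cost £{mean_cost:,.0f}, mean QALYs {mean_qalys:.2f}")
```

A real microsimulation would add time, discounting and richer event logic, but the structure above already records each individual's test result and treatment path rather than cohort-average transition probabilities.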
The stratification of a patient population may result in smaller sample sizes being recruited to trials for precision medicine interventions. Combined with more complex and variable treatment pathways, this could increase levels of uncertainty associated with cost-effectiveness estimates presented to decision makers.
Higher standard errors for estimates of treatment effect were raised as a concern [11, 35, 36, 42, 43, 46, 60, 61]. Several experts believed, however, that this concern is overstated. First, treatment effect variation between patients should be lower when therapies are targeted towards responders, thereby reducing standard errors. Second, any reduction in sample sizes could be compensated for over time through the use of large, linked observational datasets. Value of information analysis, which quantifies the expected value of reducing decision uncertainty, was also identified as a technique that could benefit decision-making [12, 33, 44, 60, 63, 64]. Along with more typical factors such as patient population size, the key determinants of value of information in precision medicine will include the sensitivity and specificity of tests and predictions, and the intervention context (i.e. whether it is used in combination with other tests).
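As a concrete illustration, the expected value of perfect information (EVPI) is the difference between the expected net benefit of choosing optimally for each simulated parameter set and the expected net benefit of the strategy that is best on average. A minimal sketch using hypothetical probabilistic sensitivity analysis output (the distributions and willingness-to-pay threshold are assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(42)
n_sim = 10_000
wtp = 20_000  # assumed willingness to pay per QALY

# Hypothetical probabilistic sensitivity analysis samples for two strategies.
qalys = np.column_stack([
    rng.normal(5.0, 0.5, n_sim),   # standard care
    rng.normal(5.3, 0.8, n_sim),   # test-and-treat strategy
])
costs = np.column_stack([
    rng.normal(10_000, 1_000, n_sim),
    rng.normal(14_000, 2_000, n_sim),
])

nb = wtp * qalys - costs  # net monetary benefit per simulation and strategy

# EVPI: value of choosing the best strategy per simulation (perfect
# information) minus the value of the best strategy on average.
evpi = nb.max(axis=1).mean() - nb.mean(axis=0).max()
print(f"per-patient EVPI: £{evpi:,.0f}")
```

Multiplying the per-patient EVPI by the size of the eligible population bounds the value of further research; in precision medicine this bound is sensitive to the test's accuracy parameters as well as to the usual population-size inputs.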
Another source of uncertainty will be unit costs, for example those of ‘omics’-based tests, which vary by laboratory. Such tests may also yield continuous results, meaning that thresholds must be set to determine the outcome of testing. Thresholds will impact on the cost effectiveness of tests and, therefore, it was argued that their determination should go beyond analysis of receiver operating characteristic curves.
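The point about going beyond ROC analysis can be illustrated by comparing a threshold chosen to minimise expected consequences (here, assumed monetary costs of false negatives and false positives) with the ROC-derived Youden threshold, which weights sensitivity and specificity equally. All numbers are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical continuous test results: diseased patients score higher.
scores_diseased = rng.normal(1.0, 1.0, 1000)
scores_healthy = rng.normal(0.0, 1.0, 4000)

# Assumed consequences (illustrative): a missed case (false negative) is
# far more costly than an unnecessary work-up (false positive).
cost_fn, cost_fp = 50_000, 2_000
prevalence = 0.2

def expected_cost(threshold):
    fn = np.mean(scores_diseased < threshold)    # missed cases
    fp = np.mean(scores_healthy >= threshold)    # false alarms
    return prevalence * fn * cost_fn + (1 - prevalence) * fp * cost_fp

thresholds = np.linspace(-3, 4, 701)
best = thresholds[np.argmin([expected_cost(t) for t in thresholds])]

# Youden's J maximises sensitivity + specificity - 1, ignoring consequences.
youden = thresholds[np.argmax([
    np.mean(scores_diseased >= t) + np.mean(scores_healthy < t) - 1
    for t in thresholds
])]
print(f"cost-minimising threshold: {best:.2f}; Youden threshold: {youden:.2f}")
```

Because a missed case is assumed far more costly than a false alarm, the cost-minimising threshold sits well below the Youden point, trading specificity for sensitivity; a decision-analytic threshold of this kind feeds directly into the cost-effectiveness estimate in a way a bare ROC curve cannot.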
Complex clinical pathways will generate substantial uncertainties over model structure in economic evaluations of precision medicine interventions. Many experts and studies highlighted this as a critical aspect of decision modelling that would need to be addressed [11, 33, 35, 36, 39, 43]. Whilst it was agreed that the current approach of extensive sensitivity and scenario analyses should continue, interviewees expressed a desire for coherent frameworks for analysing and quantifying structural uncertainties. Approaches highlighted in the literature included multi-parameter evidence synthesis, although this approach may also be impeded by sparse data. Value of information-type approaches can help to quantify the extent of this uncertainty and the value of reducing it, through techniques such as expert elicitation.
An additional consideration is uncertainty around the behaviour of clinicians and patients. Decisions made by these individuals, for example whether to follow the treatment pathway indicated by the result of a diagnostic test, could influence how clinically effective the intervention is and, thus, impact cost effectiveness [12, 33, 34, 38, 39, 43, 44]. In terms of clinician behaviour, low compliance with genotype-specific dosing recommendations has been observed. Steep learning curves for some stratification tools have also been suggested as a cause of variability. On the patient side, adherence remains an important yet under-researched determinant of effectiveness. The development and application of evidence-based computerised decision support and patient decision aids could be a way to tackle these challenges.
Equity and Equality
When generating guidance that recommends different courses of treatment for different groups of patients, HTA agencies and other public bodies should aim to ensure that principles of non-discrimination and equality of opportunity are advanced [68, 69]. The main challenge lies in the specific instances where there are small numbers of patients in rare biomarker-stratified groups, for whom there is greater uncertainty around treatment effects. An equality issue arises when the biomarkers used for stratification are correlated with factors such as ethnicity [36, 53]. In the NICE appraisal of sofosbuvir for treating chronic hepatitis C, low levels of evidence were available for some genotypes that were more common in minority ethnic patients. In this instance, a ‘pragmatic’ approach was explicitly taken on the grounds of equity, high unmet need and the lack of treatment options; evidence was extrapolated from genotypes where the treatment’s effectiveness was well-supported and the therapy was recommended for the rarer genotypes.
Stratifying patients to different treatment pathways based on measures of physiological dysregulation (such as blood pressure or cortisol level) may also introduce equity concerns. A significant, negative association between these measures and socioeconomic status has been established in the literature; differential treatment recommendations may therefore result in individuals from low socioeconomic groups having a lower probability of receiving the most effective treatments. Concerns were also raised with respect to the differential uptake of some precision medicine interventions that require patient engagement. This is particularly true in digital health, where experts reported that use of health apps was much more common in younger age groups and those with higher social and educational status. If traditional (i.e. general practitioner-delivered) services were to be withdrawn in favour of digital-only access, the benefits of precision medicine may be unevenly distributed.
Experts working for HTA agencies noted that the rate of discovery of biomarkers means that the specificity and sensitivity of companion diagnostic tests is expected to steadily improve. Similarly, health apps and AI-based algorithms are regularly updated and upgraded, meaning that certain treatment pathways might become more cost effective over time. Although beneficial, this could reduce the ‘shelf life’ of guidance issued by HTA agencies and necessitate more frequent reviews and updates. NICE have already begun addressing this issue with innovations to fast-track some evaluations and increase the capacity of the technology appraisals programme. Similar combined approaches to streamlining processes and increasing capacity will help the HTA community keep guidance up-to-date and useful while keeping the overall cost of HTA manageable.