Introduction

Adverse drug reactions (ADRs) are a frequent cause of morbidity and mortality [1]. In the USA, it has been estimated that 106,000 deaths per year are caused by ADRs [2]. In Germany, the incidence of ADR-induced hospitalizations amounts to approximately 3.25% of overall hospitalizations, and the overall ADR treatment costs sum to €434 million per year [3]. The field of pharmacogenomics or pharmacogenetics (PG; the terms are sometimes used interchangeably [4, 5]) may offer a way to reduce ADRs [6]. PG constitutes a core area of personalized medicine. The growing knowledge of genetics/genomics, and particularly the increasing understanding of the genotype–phenotype interaction, forms the basis for this personalized approach. The progress in genetic technology, characterized by faster and cheaper analytical tools, is an essential driver for personalized interventions.

Genetic analyses are the central tools in the new area of personalized medicine (often also termed stratified medicine) [7, 8]. Stratified medicine aims at classifying patients into subgroups according to genetically determined features [9]. For example, patients may be divided into groups based on the known influence of genetic parameters on drug dosage and side effects [10]. Therefore, PG uses information about a person’s genetic makeup to choose the best drug as well as the medication dosage for a particular patient [11]. The concept of stratified medicine also includes screening, preventive, or therapeutic measures for a specific subgroup of a patient population [12].

Pharmacogenetic tests (PTs) can be used to characterize individual patient features at the molecular, genetic, and cellular levels [13, 14]. PT primarily focuses on identifying specific biomarkers or genetic mutations. Generally, biomarkers can provide information for diagnostic, prognostic, and predictive purposes. In a diagnostic context (especially in an oncologic setting), biomarkers are used to identify a disease or the stage of the disease [15]. The assessment of a patient’s overall outcome (e.g., the probability of cancer recurrence after standard treatments) can be provided by prognostic biomarkers [16]. Furthermore, in a predictive context, biomarkers are used as an efficacy test before drug administration. This test serves the purpose of assessing the likelihood of a positive response after a potential treatment. In this context, predictive biomarkers can help to optimize drug selection, dose, and treatment duration as well as prevent ADRs [17].

The presence of genetic mutations or deletions can also be used for predictive purposes. Several studies have demonstrated that previously identified genetic mutations, such as those on the epidermal growth factor receptor (EGFR), Kirsten RAS (KRAS), and the breast cancer susceptibility gene I and II (BRCA I, BRCA II), predict response or resistance to treatment [18, 19]. For example, an identified EGFR gene mutation or an increased EGFR gene copy number is associated with a positive response to epidermal growth factor receptor tyrosine kinase inhibitors (EGFR-TKI) in non-small cell lung cancer (NSCLC) [20]. On the other hand, a KRAS mutation is an important predictor of resistance to EGFR-TKI therapy [21]. Moreover, gene mutations can also provide information for optimal drug dosage. For instance, the dosage of azathioprine (AZA) is based on the thiopurine-methyltransferase (TMPT) genotype or activity. Patients with no TMPT activity (TMPT deficient) receive no AZA or a reduced dose, whereas the AZA dose for patients with TMPT activity varies with the level of activity [12, 22].

The outdated concept of “one size fits all” should be replaced by stratification and a move towards patient-oriented drug treatment [23]. However, this concept is associated with both hopes and concerns. Potential advantages of targeted therapies include increasing clinical effectiveness, e.g., by improving survival [24], and improving patient safety [25]. On the other hand, there are concerns regarding the increased costs of diagnostic tests [26].

However, in recent years, an increasing number of pharmacogenomics applications have been observed [27]. Currently, 47 drugs for pharmacogenetic therapy are approved in Germany. A genetic diagnostic test prior to drug administration is required for 39 of these drugs and recommended for eight [28]. An overview of pharmacogenetic therapies is provided in supplementary file 1. The sustainability of the current trend for stratified pharmacotherapies depends on the cost-effectiveness (CE) of the treatment. The incremental cost-effectiveness ratio (ICER) is a tool to assess the CE of new interventions and is defined as the ratio of the additional costs (e.g., of a new stratified therapy vs. the standard therapy) divided by the additional benefits of the new stratified therapy vs. the standard therapy. The ICER also indicates the cost per additional benefit [e.g., life-years gained (LYG) or quality-adjusted life years gained (QALY)]. Such economic analyses are necessary for identifying therapies with the greatest health benefits at acceptable costs, as well as for the development of guidelines for an optimal and efficient treatment. The use of PTs depends on their impact on the CE of targeted therapies. As a result of the limited resources in the healthcare system and the sometimes substantial costs for active ingredients, it is important to evaluate the CEs of PT-guided targeted therapies.
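As a minimal illustration of the ICER definition given above, the following Python sketch divides the additional costs by the additional benefits. The function name and all input values are hypothetical and are not taken from any of the included studies.

```python
def icer(cost_new, cost_standard, effect_new, effect_standard):
    """Incremental cost-effectiveness ratio: additional cost per additional unit of benefit."""
    delta_cost = cost_new - cost_standard
    delta_effect = effect_new - effect_standard
    if delta_effect == 0:
        # With identical benefits the ratio is undefined; compare costs directly instead.
        raise ValueError("ICER is undefined when both strategies yield the same benefit")
    return delta_cost / delta_effect


# Hypothetical example: the stratified therapy costs 2,000 more and yields 0.5 more QALYs.
example = icer(cost_new=12_000.0, cost_standard=10_000.0,
               effect_new=8.5, effect_standard=8.0)
print(f"ICER: {example:,.0f} per QALY gained")
```

The result is expressed in cost per unit of benefit, so its interpretation depends on whether the benefit is measured in LYG, QALYs, or avoided events.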

For this purpose, we conducted a systematic literature review to analyze the CE of stratified pharmaceutical therapies. The review has two objectives:

  1. Analyze and assess the CE of PT-guided treatments in published health-economic evaluation studies.

  2. Highlight the differences and methodological characteristics of the included studies, which may influence the CE of stratified therapies.

Methods

First, PICO elements (population–intervention–comparator–outcome) were defined in order to focus the scientific issue and facilitate the literature search (Table 1).

Table 1 Review objective and PICO elements

In November 2015, a systematic literature search was conducted using the meta-database of the German Institute for Medical Documentation and Information (DIMDI), covering the following databases: ABDA, AMIS, BIOSIS Previews, Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, DAHTA-Datenbank, Database of Abstracts of Reviews of Effects, EMBASE, EMBASE Alert, ETHMED, GLOBAL Health, gms, Health Technology Assessment Database, Medline, NHS, and SciSearch. The search strategy combined economic and individualized medicine-related terms with the names of active ingredients. At the time of this research, there were 42 active ingredients approved for personalized medicine in the German market [28]. The following search strategy, using combined search terms (English and German), was applied: (1) [Abacavir OR Afatinib OR Anastrozole OR Arsentrioxid OR Ataluren OR Azathioprine OR Bosutinib OR Brentuximab vedotin OR Carbamazepine OR Cetuximab OR Crizotinib OR Ceritinib OR Dabrafenib OR Dasatinib OR Eliglustat OR Erlotinib OR Everolimus OR Exemestane OR Fulvestrant OR Gefitinib OR Ibrutinib OR Imatinib OR Ivacaftor OR Lapatinib OR Letrozole OR Lomitapide OR Maraviroc OR Mercaptopurine OR Natalizumab OR Nilotinib OR Olaparib OR Oxcarbazepine OR Panitumumab OR Pertuzumab OR Ponatinib OR Tamoxifen OR Toremifene OR Trametinib OR Trastuzumab OR Trastuzumab emtansine OR Vandetanib OR Vemurafenib] AND (2) [Biomarker OR individuali* OR personali* OR stratif* OR Subgruppe* OR subgroup* OR pharmakogen* OR pharmacogen* OR Test* OR profiling] AND (3) [Nutzen OR benefit OR Nutzwert OR utility OR Effektivität OR effectiveness OR effizien* OR efficien*] AND (4) [Kosten* OR cost* OR technology assessment]. Within each block, the terms were linked by the operator “OR”; the operator “AND” combined the four blocks, and an asterisk was used as a truncation symbol for greater search coverage. Additionally, a hand search was conducted. Titles and abstracts were assessed independently by two researchers. Only original studies published in full text were included. Full papers were assessed by two researchers, and disagreements were resolved through discussion. Figure 1 summarizes the search process.
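The structure of the search string (terms joined by OR within a block, blocks joined by AND, an asterisk as truncation) can be sketched in Python as follows. The helper names are hypothetical, the term lists are shortened for readability, and the sketch does not reproduce the query syntax of the DIMDI interface.

```python
def or_block(terms):
    """Join the terms of one block with OR and wrap them in square brackets."""
    return "[" + " OR ".join(terms) + "]"


def build_query(blocks):
    """Number each block and combine the blocks with AND."""
    return " AND ".join(
        f"({number}) {or_block(terms)}" for number, terms in enumerate(blocks, start=1)
    )


def matches_truncated(word, term):
    """Emulate truncation: 'stratif*' matches any word starting with 'stratif'."""
    if term.endswith("*"):
        return word.lower().startswith(term[:-1].lower())
    return word.lower() == term.lower()


# Shortened term lists; the full lists are given in the search strategy above.
blocks = [
    ["Abacavir", "Afatinib", "Anastrozole"],
    ["Biomarker", "stratif*", "pharmacogen*"],
    ["Nutzen", "benefit", "utility"],
    ["Kosten*", "cost*", "technology assessment"],
]
query = build_query(blocks)
print(query)
```

A record is retrieved only if it matches at least one term from every block, which is why adding a block narrows the search while adding a term to a block widens it.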

Fig. 1 Flow diagram of articles identified and evaluated on the basis of inclusion criteria

To ensure comparability, the results were converted to US dollars at the exchange rate of the year of publication [29, 30].
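The conversion step can be sketched as a simple lookup of the exchange rate by currency and publication year. The rates in the table below are illustrative placeholders, not the rates used in this review; the function names are likewise hypothetical. The second helper only back-calculates the rate implied by a reported pair of values.

```python
def to_usd(amount, currency, year, rates):
    """Convert an amount to US dollars using the rate for the given currency and year."""
    return amount * rates[(currency, year)]


def implied_rate(amount_usd, amount_local):
    """Back-calculate the exchange rate implied by a reported pair of values."""
    return amount_usd / amount_local


# Illustrative placeholder rates (US dollars per unit of local currency), not real data.
hypothetical_rates = {("EUR", 2010): 1.30, ("GBP", 2010): 1.55}

converted = to_usd(1_000, "EUR", 2010, hypothetical_rates)
print(round(converted, 2))

# Rate implied by a value pair reported in this review: £37,314 (US$53,674).
print(round(implied_rate(53_674, 37_314), 3))
```

Because rates are fixed per publication year, two studies reporting the same amount in the same currency can yield different US dollar values.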

The published 100-point Quality of Health Economic Studies (QHES) instrument was used to evaluate the quality of the included studies (Table 2) [31]. The QHES evaluation was also conducted by two independent researchers, and disagreements were resolved through discussion.

Table 2 The Quality of Health Economic Studies (QHES) instrument

The QHES instrument consists of 16 items, each weighted with between one and nine points. Summing the scores of all items yields an overall score that indicates the quality of an article, which was then assigned to one of four quality categories (Table 3).

Table 3 Classification of study quality
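The mapping from an overall QHES score to a quality category can be sketched as below. The cut-offs (0 to 24, 25 to 49, 50 to 74, 75 to 100) are the bands commonly used with the QHES instrument and are an assumption here, because the content of Table 3 is not reproduced in the text; the function name is hypothetical.

```python
def qhes_category(score):
    """Assign a QHES overall score (0-100) to one of four quality categories.

    The cut-offs are assumed (commonly used QHES bands), not taken from Table 3.
    """
    if not 0 <= score <= 100:
        raise ValueError("QHES scores range from 0 to 100")
    if score >= 75:
        return "high"
    if score >= 50:
        return "fair"
    if score >= 25:
        return "poor"
    return "extremely poor"


for score in (85.81, 60, 30, 10):
    print(score, qhes_category(score))
```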

This article does not contain any new studies with human or animal subjects performed by any of the authors.

Results

The database search identified 1535 records. After removing 175 duplicates, the title and abstract of the remaining 1360 records were screened. Subsequently, 1238 records were excluded as they did not cover the objective of the study. The remaining 122 records were assessed for eligibility, and inclusion criteria were fulfilled by 27 studies, which were included in the final assessment (Fig. 1).
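The record counts reported above can be checked with simple arithmetic. The sketch below recomputes each stage of the selection process; the number of records excluded at the full-text stage is derived from the reported figures rather than stated in the text.

```python
identified = 1535             # records found by the database search
duplicates = 175              # duplicates removed
excluded_at_screening = 1238  # excluded after title/abstract screening
included = 27                 # studies in the final assessment

screened = identified - duplicates
assessed_for_eligibility = screened - excluded_at_screening
excluded_at_full_text = assessed_for_eligibility - included  # derived, not reported

print(screened, assessed_for_eligibility, excluded_at_full_text)
```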

All studies are characterized by a variety of elements, such as country, perspective, treatment line, active ingredient, treatment strategy, biomarkers, consideration of test costs, consideration of the sensitivity and specificity of the test, and funding source. A detailed overview is provided in supplementary material 2.

Quality Assessment (QHES)

The results of the quality assessment using the QHES instrument are presented in Table 4. The average score was 85.81. Three studies [46, 47, 56] were assessed as being of fair quality, while all others achieved a high quality score. All studies presented their objective clearly (QHES item 1), but seven did not state the perspective of the study (QHES item 2) [22, 33, 37, 40, 45, 53, 56]. In three studies, data were not extracted from the best available source (QHES item 3) [32, 48, 49]. Six studies used data from a subgroup analysis (QHES item 4) [32, 36, 37, 42, 52, 53]. All studies except one handled uncertainties properly (QHES item 5) [56]. All studies except five performed an incremental analysis of costs and outcomes between the alternatives (QHES item 6) [38, 39, 47, 51, 56]. Detailed information on the methodology of data extraction was not reported in four studies (QHES item 7) [37, 46, 47, 56]. The majority of studies fulfilled the criteria of QHES items 8 and 9. Only four studies did not choose an appropriate time horizon or did not discount benefits and costs beyond 1 year (QHES item 8) [43, 46, 51, 55]. Furthermore, four studies failed to measure costs appropriately or to describe clearly the methods for estimating quantities and unit costs (QHES item 9) [41, 46, 47, 56]. All studies clearly stated the primary outcome (QHES item 10). All studies except three stated valid health outcomes or justified the measurement used where more valid and reliable measures were not available (QHES item 11) [12, 47, 48]. In most studies, the economic model, methods, and analyses were displayed transparently; the exceptions were four studies (QHES item 12) [22, 39, 46, 52]. All studies gave a justification for the choice of limitations or assumptions (QHES item 13). The authors of seven studies explicitly discussed the direction and magnitude of potential bias (QHES item 14) [39–41, 43, 45, 52, 54]. All studies provided proper conclusions or recommendations based on their results (QHES item 15). Finally, only six studies did not disclose the source of funding (QHES item 16) [22, 39, 42, 43, 46, 56].

Table 4 Results of the QHES assessment

Main Characteristics of the Studies

All main characteristics of the studies are presented in Table 5. The included studies were published between 2002 and 2015. In the years 2000, 2001, and 2003 we did not find publications that satisfied the inclusion criteria. Two-thirds of the selected articles were published in the last 7 years. Furthermore, studies carried out in recent years (between 2009 and 2015) achieved a higher QHES average score than those published previously. AZA is the active ingredient for which PTs were most frequently evaluated (seven of the 27 included studies). Five of these seven evaluations were published between 2002 and 2006, and the latest article was published in 2014. TMPT, which predicts the potential effectiveness of AZA application, is the most commonly evaluated biomarker. Six of the nine studies focusing on TMPT were published between 2002 and 2006. Over two-fifths of the studies included here evaluated the CE of PT-guided therapy in oncological diseases. Table 5 shows the subdivision of the included studies according to the main categories as well as the average QHES score and range in each category.

Table 5 Number of studies in the main categories

Cost-effectiveness of Pharmacogenetics Testing in Specific Therapeutic Areas

Epilepsy

The cost-effectiveness of pharmacogenetics testing in the treatment of epilepsy was evaluated in three studies. The latest study, by Plumpton et al. [50], focused on the HLA-A*31:01 allele screening test. An ICER of £37,314 (US$53,674) per avoided cutaneous ADR was calculated for a prior HLA-A*31:01 allele test followed by carbamazepine (CBZ) administration according to the test result. Studies from Dong et al. [37] and Rattanavipapong et al. [52] also examined the CE of PT prior to CBZ administration; however, these analyses aimed at identifying the presence of the HLA-B*15:02 allele. Rattanavipapong et al. [52] examined the influence of prescribing CBZ with and without a prior HLA-B*15:02 allele test for epilepsy as well as neuropathic pain. In the case of epilepsy, they calculated an ICER of THB 220,000 (US$7066) per QALY gained through PT and CBZ administration following the test results, while for neuropathic pain, the ICER was THB 130,000 (US$4137) per QALY. Dong et al. [37] investigated the CE of HLA-B*15:02 allele testing prior to initiation of CBZ therapy in Singapore. In comparison with no testing and CBZ prescription to all patients, the test result-based CBZ administration achieved an ICER of US$29,750. The frequency of the HLA-B*15:02 allele differs between the three major ethnic populations present in Singapore. Therefore, separate ICERs were calculated for each of these groups. The test strategy led to an ICER of US$37,030 per QALY for Singapore Chinese, an ICER of US$7930 per QALY for Singapore Malays, and an ICER of US$136,630 per QALY for Singapore Indians. Measured against the US$50,000 threshold, PT before CBZ administration is cost-effective for Singapore Malays and Singapore Chinese.
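The threshold comparison for the three population groups reported by Dong et al. [37] can be reproduced with a few lines of Python. The ICER values and the US$50,000 threshold are taken from the text; the variable and function names are hypothetical, and the simple comparison assumes that testing is both more costly and more effective than no testing.

```python
THRESHOLD_USD_PER_QALY = 50_000

# ICERs (US$ per QALY) for HLA-B*15:02 testing as reported for Dong et al. [37]
icers_usd_per_qaly = {
    "Singapore Chinese": 37_030,
    "Singapore Malays": 7_930,
    "Singapore Indians": 136_630,
}


def is_cost_effective(icer, threshold=THRESHOLD_USD_PER_QALY):
    """An ICER below the threshold is classified as cost-effective."""
    return icer < threshold


verdicts = {group: is_cost_effective(value) for group, value in icers_usd_per_qaly.items()}
for group, verdict in verdicts.items():
    print(f"{group}: {'cost-effective' if verdict else 'not cost-effective'}")
```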

HIV/AIDS

All HIV/AIDS studies included here analyzed the CE of an HLA-B*57:01 allele test before abacavir (ABC) administration. Hughes et al. [42] compared the CE of an HLA-B*57:01 allele test prior to ABC prescription (patients with a positive test result received an alternative treatment and patients without the HLA-B*57:01 allele were treated with ABC) with that of treating patients with ABC without testing. The testing strategy was found to be dominant. However, the incremental CE depends on the costs of the alternative treatment: depending on the costs of the highly active antiretroviral therapy (HAART) alternative, the results ranged from a dominant ICER (the testing strategy is less expensive and more effective) up to €22,811 (US$26,714) per avoided hypersensitivity reaction (HSR).
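The notion of a dominant strategy (less expensive and more effective) can be made explicit by classifying the incremental costs and effects of a strategy before any ratio is computed. The following Python sketch is one possible formulation; the function name, the category labels, and the example values are hypothetical.

```python
def classify_strategy(delta_cost, delta_effect, threshold):
    """Classify a strategy relative to its comparator.

    delta_cost and delta_effect are the differences (strategy minus comparator);
    threshold is the acceptable cost per additional unit of benefit.
    """
    if delta_cost == 0 and delta_effect == 0:
        return "no difference"
    if delta_cost <= 0 and delta_effect >= 0:
        return "dominant"       # not more costly and not less effective
    if delta_cost >= 0 and delta_effect <= 0:
        return "dominated"      # not cheaper and not more effective
    if delta_cost > 0:          # more costly and more effective: use the ICER
        ratio = delta_cost / delta_effect
        return "cost-effective" if ratio < threshold else "not cost-effective"
    return "cheaper but less effective"  # requires a separate trade-off judgment


# Hypothetical values for illustration
print(classify_strategy(-100.0, 0.2, threshold=50_000))
print(classify_strategy(500.0, 0.0, threshold=50_000))
print(classify_strategy(1_000.0, 0.5, threshold=50_000))
```

Reporting the category alongside the ratio avoids misreading a negative ICER, which can arise from either a dominant or a dominated strategy.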

Schackman et al. [53] determined an ICER of US$36,700 per QALY for a previous HLA-B*57:01 allele test and a test result-based treatment in comparison with no testing.

On the other hand, Nieves Calatrava et al. [48] reported an ICER of €630.16 (US$807) per avoided HSR, and Kauf et al. [44] calculated an even lower ICER of only US$328 per avoided HSR for an HLA-B*57:01 allele test-based ABC treatment (as opposed to the prescription of ABC without a predictive test).

The latest published study by Kapoor et al. [43] provides a detailed analysis of HLA-B*57:01 allele testing before ABC prescription in three ethnicities. Furthermore, results were differentiated by disease stage (early and late stage) and treatment strategy (tenofovir and ABC can both be prescribed as first-line treatment, or tenofovir is contraindicated). For early stage treatment, where tenofovir and ABC can be prescribed as first-line, the CE for an HLA-B*57:01 allele test-based ABC treatment (in contrast to administration of ABC without testing) resulted in an ICER of US$415,845 per QALY for Han-Chinese, an ICER of US$318,029 per QALY for Southeast-Asian Malays, and an ICER of US$208,231 per QALY for South-Asian Indians. For this treatment line, where both active ingredients were prescribed, a CE analysis was also performed for patients at a later stage of the disease. In the latter case, ICERs of US$926,938 per QALY for Han-Chinese, of US$624,297 per QALY for Southeast-Asian Malays, and of US$284,598 per QALY for South-Asian Indians were calculated. This study also included a CE analysis for these three patient groups when tenofovir was contraindicated. For the early stage treatment group, ICERs of US$252,350 per QALY for Han-Chinese, of US$154,490 per QALY for Southeast-Asian Malays, and of US$44,649 per QALY for South-Asian Indians were calculated. For patients at a later stage of the disease, ICERs of US$757,270 per QALY for Han-Chinese, of US$454,223 per QALY for Southeast-Asian Malays, and of US$114,068 per QALY for South-Asian Indians were found. This study indicates that a predictive test prior to ABC administration is not cost-effective, independently of the disease stage. Exceptions are tenofovir-contraindicated early-stage patients.

Immunology

Inflammatory Bowel Diseases

Winter et al. [56] conducted a CE analysis for a PT that analyzed TMPT activity. The dosage of AZA is based on TMPT activity. Hence, a standard AZA dose without prior testing was compared to an activity-based AZA dosage. Costs of £487 (US$776) per life-year saved (LYS) for a 30-year-old patient and of £951 (US$1515) for a 60-year-old patient were determined.

On the other hand, Dubinsky et al. [39] and Priest et al. [51] found genotype-based TMPT testing prior to initiation of AZA to be cost-effective compared with administering a standard dosage of AZA without a prior predictive test. Furthermore, Priest et al. [51] compared phenotypic and genotypic testing and showed that the phenotypic TMPT test strategy was the most cost-effective approach.

Rheumatologic Conditions (Rheumatoid Arthritis and Systemic Lupus Erythematosus)

Marra et al. [47] and Oh et al. [49] evaluated the CE of PT in the therapeutic area of rheumatologic conditions. In both studies, administering a TMPT test result-based dose of AZA is more effective and less costly than administering a standard dose of AZA without prior testing.

Idiopathic Pulmonary Fibrosis

Hagaman et al. [22] evaluated the CE of TMPT testing in idiopathic pulmonary fibrosis. The performance of a TMPT test and the test result-based AZA dosage (in contrast to the administration of a standard dose AZA without prior TMPT test) resulted in an ICER of US$29,663 per QALY.

Autoimmune Disease

Thompson et al. [12] investigated the CE of TMPT testing prior to AZA administration in autoimmune diseases. An incremental cost of £421.06 (US$625) and an incremental net benefit of £256.89 (US$381) for TMPT activity test prior to AZA administration (in contrast to the administration of a standard dose of AZA without TMPT test) were determined.

Oncology

Breast Cancer (Early Stage)

Lyman et al. [46] investigated the CE of PT in early stage breast cancer relative to the recurrence of the disease. A strategy of testing the risk of relapse was compared with administration of the standard therapy, consisting of tamoxifen and chemotherapy. In the test strategy, patients at low risk of relapse received only tamoxifen, and the others received tamoxifen and chemotherapy. Lyman et al. [46] determined an ICER of US$3385 per LYS (no indication of age), whereas Hall et al. [41] reported an ICER of US$8852 per QALY (patients above 60 years of age). Hall et al. [41] concluded that a general statement on the cost-effectiveness could not be made because of substantial uncertainties.

Blank et al. [34] investigated the CE of PT in early stage breast cancer prior to administration of trastuzumab. In this study a comparison of a test result-based administration of trastuzumab and the administration of the drug without a prior test was conducted. In the test strategy, patients with proven HER2 overexpression received trastuzumab, whereas patients without HER2 overexpression received an alternative therapy. Two testing procedures were considered: immunohistochemistry (IHC test) and fluorescence in situ hybridization (FISH test). The therapy with both tests alone or in combination (compared with no previous test) had significantly lower costs, but the FISH test alone was considered the most cost-effective approach. However, administering trastuzumab with no previous test achieved a higher benefit, as a result of the imperfect sensitivity and specificity of the tests. A CE ratio was not calculated.

Metastatic Breast Cancer

Elkin et al. [40] evaluated the CE of PT prior to trastuzumab administration in metastatic breast cancer. A HER2 overexpression test prior to trastuzumab prescription was compared with the prescription of trastuzumab and chemotherapy without a predictive test. Patients with HER2 overexpression received a combination treatment consisting of trastuzumab and chemotherapy. Patients without HER2 overexpression received chemotherapy only. In this study, IHC and FISH tests were used to determine HER2 overexpression. The use of a FISH test resulted in a dominant ICER. Furthermore, performing the IHC test before the FISH test was the most cost-effective approach. However, the benefit provided by this strategy was lower than that of trastuzumab administration without a prior test.

Metastatic Colorectal Cancer

Shiroiwa et al. [54] analyzed the CE of a PT prior to administration of cetuximab in metastatic colorectal cancer. A KRAS mutation test with result-based administration of cetuximab (patients with wild-type KRAS received cetuximab and patients with KRAS mutations received best supportive care, BSC) was compared with cetuximab treatment without a predictive test. A dominant ICER for the testing strategy was determined.

Vijayaraghavan et al. [55] determined the cost-effectiveness of a KRAS mutation test prior to administration of cetuximab monotherapy, treatment with cetuximab in combination with chemotherapeutics, and panitumumab monotherapy. Patients with a KRAS mutation received exclusively chemotherapeutics in combination therapy and BSC for monotherapy. The use of a KRAS mutation test before prescription of cetuximab monotherapy, panitumumab monotherapy, and cetuximab combination therapy achieved a dominant ICER compared to the treatment without the predictive test.

Blank et al. [35] evaluated the CE of a KRAS mutation test and a subsequent BRAF gene test before administration of cetuximab in combination with BSC for metastatic colorectal cancer. Patients with a KRAS or BRAF mutation received BSC only. The subsequent verification of BRAF status after the KRAS test was the most cost-effective approach compared to treating all patients without testing or testing for KRAS alone. However, perhaps as a result of the imperfect sensitivity and specificity, there was a higher benefit in prescribing cetuximab without a prior test compared with the test strategies. An ICER for a predictive test prior to cetuximab administration, as compared to treating all patients with cetuximab without prior testing, was not reported.

Behl et al. [33] also evaluated the CE of a subsequent BRAF gene test in addition to a KRAS mutation analysis prior to cetuximab administration in combination with BSC. The subsequent verification of BRAF status after the KRAS test was also the most cost-effective approach. However, even in this case, perhaps as a result of the imperfect sensitivity and specificity of the testing procedures, cetuximab without a prior test led to a higher benefit. An ICER was not stated.

Acute Lymphoblastic Leukemia

Van den Akker-van Marle et al. [32] conducted a CE study for a PT prior to mercaptopurine administration in acute lymphoblastic leukemia in children. There, an ICER of €4800 (US$5702) per LYG for a genotypic TMPT activity test and TMPT activity-based mercaptopurine dosage, compared to no testing and administration of a standard initial dose of mercaptopurine, was determined.

On the other hand, in the study by Donnan et al. [38] neither a phenotypic nor a genotypic test for determining TMPT activity prior to mercaptopurine administration proved to be cost-effective (higher costs for the same benefit).

Advanced Non-Small Cell Lung Cancer

Carlson et al. [36] conducted a CE study for a PT prior to erlotinib administration in advanced non-small cell lung cancer patients. A test strategy, in which patients with EGFR mutations received erlotinib and patients without EGFR mutation received an alternative therapy, was compared with the treatment of all patients with erlotinib without a prior test. An ICER of US$162,018 per QALY for the use of a gene copy number test was determined. This ICER clearly exceeded the threshold of US$100,000 to US$150,000 per QALY set in the study.

De Lima Lopes et al. [45] evaluated the cost-effectiveness of the EGFR test prior to gefitinib prescription. A dominant ICER for the comparison of the use of an EGFR test prior to gefitinib administration and no testing while prescribing chemotherapy with subsequent gefitinib administration was determined. In the test strategy, patients with an EGFR mutation received gefitinib followed by chemotherapy as second-line therapy. Patients without EGFR mutation received chemotherapy with subsequent BSC.

Main Results of This Systematic Review

In this systematic review, six main results were obtained:

  1. In the majority of studies, a PT-guided administration of an active ingredient was found to be cost-effective or to lead to cost savings.

  2. A general statement on CE for a test-guided application of an active ingredient (independently of the indication for which it has been prescribed) could not be made.

  3. The largest group of studies analyzed the CE of targeted therapies in oncological diseases.

  4. The CE depends on various factors (e.g., prevalence of biomarkers, test costs, threshold value, prevalence of ADRs, response rate of therapy).

  5. The CE of a PT-guided therapy can differ between indications as well as within the same indication.

  6. The results depend on the perspective of the study (society, healthcare system, and payer).

Discussion

This comprehensive review analyzed the CE of PT-guided therapies. For this purpose we included only studies that compared the CE of the administration of an active ingredient with or without a prior predictive test. PTs serve to determine the effectiveness of active ingredients, to support therapeutic decisions, and ultimately to optimize patient benefit by avoiding ADRs. Preventing ADRs leads to an increase in drug safety and is therefore the central argument for the application of PTs [57, 58]. However, the usefulness of such pharmacogenetic tools depends on their CE. CE analyses are essential for decision-makers when deciding on the reimbursement and pricing of new technologies. This review investigated whether PTs contribute to an efficient therapy management.

The average QHES score across all 27 assessed studies was 85.81. The QHES instrument assesses the methodological quality of the studies. However, it does not adequately capture the specific features of stratified medicine. Important criteria in the assessment of PTs are the prevalence of biomarkers, the sensitivity and specificity of the test, and testing costs.

Generally, innovations are used if they have a significant influence on outcomes (e.g., on survival or on quality of life). As a result of the limited healthcare budget, it is essential to assess the additional benefits of an innovation in comparison with previous standards. Therefore, CE analyses are necessary and are used for reimbursement decisions. The CE of a medical intervention depends on whether it is able to provide benefits at a reasonable cost. CE analyses estimate the ICER of interventions. The ICER is an analytical tool of the CE analysis (CEA) that relates the difference in costs between two treatments to the difference in their outcomes (e.g., new treatment vs. previous treatment). Threshold values vary from country to country. For example, a threshold of US$50,000 is commonly cited in the USA [59]: an intervention with an ICER of less than US$50,000 per additional QALY is classified as cost-effective. The CE depends on several factors. In this comprehensive review, some divergent features in study design that influenced the CE were identified.

Perspective of the study The CE of a study depends, among other things, on the chosen perspective (e.g., healthcare system, society) [60]. When indirect costs are not considered, no final assessment or comprehensive interpretation is possible. Ideally, costs should be collected from a societal perspective. However, some of the costs required for this purpose are difficult to quantify (e.g., loss of wages) [61].

Time horizon/discounting Different CE values arise because of the various time horizons. For the consideration of ADRs, a time horizon of 1 year would be sufficient, because the ADRs addressed by pharmacogenetic applications appear immediately after the active ingredient has been administered [62]. A uniformly defined time horizon would improve comparability. In contrast, for the consideration of pharmacodynamic effects, a lifelong time horizon should be considered, since the costs of long-term consequences, or of avoiding them, are of considerable importance.
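The interaction between time horizon and discounting can be illustrated with a short Python sketch of a present-value calculation. The discount rate of 3% and the yearly cost stream are hypothetical and are not taken from any of the included studies; the first year is treated as undiscounted.

```python
def present_value(yearly_amounts, discount_rate):
    """Discount a stream of yearly amounts; the amount in year 0 is not discounted."""
    return sum(
        amount / (1 + discount_rate) ** year
        for year, amount in enumerate(yearly_amounts)
    )


# Hypothetical cost stream: 1,000 per year over a 1-year and a 10-year horizon
one_year = present_value([1_000.0], discount_rate=0.03)
ten_years = present_value([1_000.0] * 10, discount_rate=0.03)

print(round(one_year, 2))
print(round(ten_years, 2))
```

With a horizon of 1 year, discounting has no effect, which is why it becomes relevant only for costs and benefits that accrue beyond the first year.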

Impact of sensitivity and specificity of the test procedures Weaknesses in the sensitivity and specificity of the predictive tests may influence the CE of a strategy. Sensitivity and specificity are characterized by a great heterogeneity. This could lead to an incorrect classification as responder or non-responder. Thus, it may result in the administration of ineffective drugs, undesirable effects, or the exclusion of an effective therapy. Generally, this implies losses of effectiveness for the relevant therapy.
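How imperfect sensitivity and specificity translate into misclassified patients can be sketched by computing the expected number of true and false test results in a cohort. All parameter values below are hypothetical, and the function name is an assumption made for illustration.

```python
def expected_test_results(cohort_size, prevalence, sensitivity, specificity):
    """Expected counts of correctly and incorrectly classified patients."""
    with_marker = cohort_size * prevalence
    without_marker = cohort_size - with_marker
    true_positive = with_marker * sensitivity
    true_negative = without_marker * specificity
    return {
        "true_positive": true_positive,
        "false_negative": with_marker - true_positive,      # marker missed by the test
        "true_negative": true_negative,
        "false_positive": without_marker - true_negative,   # marker wrongly reported
    }


# Hypothetical cohort: 1,000 patients, 10% biomarker prevalence
results = expected_test_results(1_000, prevalence=0.10, sensitivity=0.90, specificity=0.95)
for name, count in results.items():
    print(f"{name}: {count:.0f}")
```

Depending on what the biomarker indicates, false negatives and false positives correspond either to patients exposed to an ineffective or harmful drug or to patients excluded from an effective therapy.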

Prevalence of biomarkers Biomarker prevalence in the specific study populations is based on different assumptions. Dong et al. [37] differentiated the study population according to allele frequencies, because the HLA-B*15:02 allele frequencies differ between ethnic groups. The corresponding classification leads to an increased degree of stratification. Fundamentally, a lower biomarker prevalence leads to a lower CE of the PT [63]: because the likelihood of identifying a responder is lower, the overall benefit is low. Homogeneous groups enable an increase in test validity or in the likelihood of identifying a responder, as well as the examination of biomarker prevalence values by sensitivity analysis.
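One way to see why a lower biomarker prevalence reduces the CE of a PT is to compute the testing cost per correctly identified carrier, which rises as prevalence falls. The sketch below is a simplification that ignores downstream treatment costs and benefits; the test cost and prevalence values are hypothetical.

```python
def cost_per_identified_carrier(test_cost, prevalence, sensitivity=1.0):
    """Testing cost spent, on average, to correctly identify one biomarker carrier."""
    if not 0 < prevalence <= 1 or not 0 < sensitivity <= 1:
        raise ValueError("prevalence and sensitivity must lie in (0, 1]")
    return test_cost / (prevalence * sensitivity)


# Hypothetical test cost of 100 per patient at three prevalence levels
for prevalence in (0.15, 0.05, 0.01):
    cost = cost_per_identified_carrier(100.0, prevalence)
    print(f"prevalence {prevalence:.0%}: {cost:,.0f} per identified carrier")
```

Varying the prevalence in this way corresponds to the one-way sensitivity analysis of biomarker prevalence mentioned above.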

Costs of testing procedures Differences in price years, countries, and test characteristics, a lack of transparency on test prices, and the frequent use of estimates limit the comparability of the costs of testing procedures. Sensitivity analyses of the price may mitigate this lack of comparability. Possible future cost reductions of PTs will have a positive impact on the CE.

Lack of evidence-based data The data used for CE evaluations are partially of insufficient quality and quantity. The evaluations are often derived from retrospective studies. Randomized controlled trials (RCTs) enable the generation of evidence-based data and provide a valid basis for CEA. RCTs are regarded as the gold standard of data collection [64, 65]. The main problems in this context are low funding, low interest in clinical trials (except studies for approved medications), small patient populations, and a lack of valid discoveries [66]. It is difficult to conduct an RCT for pharmacogenetic applications. The anticipated differences in treatment effectiveness accompanying the test strategies and the need to generate significant outcomes in patients with a similar genotype require large group sizes [67].

Oncology is the most frequently discussed disease area for CEA. This indication area is characterized by the high toxicity of chemotherapeutic agents as well as poor clinical outcomes [68, 69]. This gives oncology the potential to be one of the largest and most attractive fields for the application of pharmacogenomics. Oncology is particularly well suited to show CE, because it is an area with a large number of affected patients and with expensive cancer-associated outcomes (chronic pain, ADRs, death). Even minor improvements in outcomes affect the CE, because expensive outcomes such as long hospital stays can be prevented.

There are some economic, clinical, and practical challenges in connection with the development and the application of PTs. Research and development of pharmacogenetic applications is characterized by some regulatory challenges [70, 71] and high costs to prove clinical benefits [72]. There is a disincentive for pharmaceutical companies to invest in companion diagnostics [73, 74]: an investment in a market without free pricing is a risk for pharmaceutical companies. Genetic analyses (subgroup analysis) divide the market and reduce the total turnover. In countries without the possibility of dynamic pricing or changes in price according to subgroups or indications, the different value of PTs for the specific subgroups is appropriate. A general problem of personalized medicine is that drugs are developed for small patient groups at the same research and development costs as drugs for larger groups [75]. The risk of low total turnover from small user groups hinders further research and development in the field of targeted therapies. Therefore, in areas with larger market segmentations, pharmacogenetic research should be financed by public resources [76]. Moreover, payers have concerns about pharmacogenetic applications. PTs as well as proteomic tests seem to be more expensive than conventional diagnostic and prognostic tools [77]. Currently, only a few pharmacogenetic examinations are financed within the uniform value scale (EBM), which is the basis for pricing outpatient services. Performing a PT for eight of 47 active ingredients is not compulsory. CEAs were conducted for 10 of these 47 active ingredients. The insufficient basis for a conclusion can be used as a reason for the restrained reimbursement for PTs.

Furthermore, the clinical benefit of an intervention (e.g., CE, net benefit) is an essential prerequisite for PT application. However, because of the lack of evidence for the influence of a PT on the clinical outcome [78], it is difficult to prove the benefit. No test can perfectly predict whether a patient will respond positively to a particular treatment. Various factors influence the therapeutic outcome. ADRs often occur immediately after treatment [79]. Thus, the outcomes (e.g., cost per avoided ADR) can be quickly and easily observed [61], especially in oncological studies. Moreover, the effects also depend on the quality of ADR monitoring.

Some practical challenges are connected with the routine use of PT. The partly missing reimbursement [27, 80], the lack of clinical guidelines [81], and the processing time associated with treatment delays [82] preclude their widespread application. Furthermore, the use of PT essentially depends on its acceptance by physicians [83]. The restrained use of PT is the result of the missing clinical validation for the clinical application as well as of the missing practical and standardized guidelines [84]. There are also ethical concerns regarding the use of PT. Patients are excluded from targeted therapies on the basis of the test results. The insufficient sensitivity and specificity of PTs may lead to a wrong stratification and therefore to the lack of an effective treatment.

Both the costs of the tests and the savings that could be achieved through the use of predictive tests must be known. If the savings exceed the costs of testing, it is economically sensible to conduct a PT. In modelling the CE of PT, important factors such as the sensitivity and specificity of these tests, the degree of gene penetrance, the association between genotype and clinical outcome, genotype prevalence in the population, the likelihood of ADRs, and survival according to the genotype and the treatment strategy should be considered.
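The comparison of testing costs and savings can be sketched as a very simple expected-value model per patient tested. This is an illustration under strong assumptions that are not taken from the reviewed studies: test-positive patients are switched to an alternative therapy that fully avoids the genotype-related ADR, non-carriers gain nothing from switching, and penetrance and survival are ignored. The class `PtScenario`, the function `expected_net_saving`, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PtScenario:
    """Hypothetical inputs for one pharmacogenetic test (PT) strategy."""
    test_cost: float               # price of one test
    prevalence: float              # share of patients carrying the risk genotype
    sensitivity: float             # P(test positive | carrier)
    specificity: float             # P(test negative | non-carrier)
    adr_risk_carrier: float        # ADR probability in carriers on standard drug
    adr_cost: float                # treatment cost of one ADR
    alternative_extra_cost: float  # added cost of the alternative therapy


def expected_net_saving(s: PtScenario) -> float:
    """Expected saving per patient tested (negative means testing costs more)."""
    true_pos = s.prevalence * s.sensitivity
    false_pos = (1.0 - s.prevalence) * (1.0 - s.specificity)

    # Assumption: switching a detected carrier avoids the ADR entirely.
    avoided_adr_cost = true_pos * s.adr_risk_carrier * s.adr_cost
    # Every test-positive patient, correctly classified or not, is switched.
    switching_cost = (true_pos + false_pos) * s.alternative_extra_cost

    return avoided_adr_cost - s.test_cost - switching_cost


scenario = PtScenario(
    test_cost=50.0, prevalence=0.10, sensitivity=0.95, specificity=0.98,
    adr_risk_carrier=0.50, adr_cost=2000.0, alternative_extra_cost=100.0,
)
saving = expected_net_saving(scenario)
verdict = "testing saves money" if saving > 0 else "testing adds cost"
print(f"Expected net saving per patient tested: {saving:.2f} ({verdict})")
```

A full CEA would additionally relate the cost difference to a health outcome (e.g., cost per avoided ADR) and would vary each input in sensitivity analyses.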

The quality assessment through the QHES may be subjective and may represent a major limitation of this study. Items that record whether a study aspect is present are easy to assess. In contrast, items that evaluate the adequacy of an aspect are subject to variation between assessors. Therefore, two researchers performed the assessment independently to minimize this subjectivity of the QHES instrument.

National and international standards for the assessment of PT should be defined and implemented to improve the quality of the studies. Uncertainties may be decreased by more accurate estimations of effectiveness and costs [85]. Furthermore, an independent financing system (e.g., public financing) could enhance the credibility of the results. Such studies focus not solely on effectiveness but also on efficiency.

Conclusion

The application of personalized therapies is partly associated with high economic costs. This review has demonstrated that, in the majority of the studies included here, test-guided personalized therapies are more cost-effective than non-test-guided personalized therapies. Hence, a test before drug administration seems to be useful for therapeutic decisions, dosing according to the different genotypes or gene activity, and/or reducing adverse drug reactions. However, the results of the studies are mainly influenced by factors such as the sensitivity and specificity of the test procedures, the prevalence of biomarkers, and the perspective of the study. Generally, analyses of the CE are an essential part of the reimbursement recommendations. However, to guarantee the comparability of the CE of stratified drug therapies, national and international standards for evaluation studies should be defined.