Background

In the United States (US), adverse drug reactions (ADRs) rank between the fourth and sixth leading cause of death, accounting for more than 100,000 deaths per year [1]. A similar trend is seen in Europe, where ADRs account for approximately 5% of all hospitalizations and 197,000 deaths annually [2, 3]. The most severe life-threatening ADRs are Stevens-Johnson syndrome (SJS) and toxic epidermal necrolysis (TEN). The majority of cases are caused by reactions to certain drugs, e.g., allopurinol, sulfa drugs or carbamazepine. ADRs can also result in a substantial economic burden: the annual economic impact of severe and fatal ADRs leading to mortality and morbidity has been estimated at nearly $177 billion in the US and €79 billion in Europe [2, 4].

Several methods are currently available for identifying people at risk of ADRs according to clinical features, such as renal or liver function, age and dosage, as well as by identifying drug interactions. Genetic factors can also cause ADRs, accounting for approximately 10–20% of them [5]. Genetic information obtained from polymorphism-based pharmacogenomics or pharmacogenetics is crucial for better identifying responders and non-responders to medications, as well as people who are at risk of ADRs or drug inefficacy, prior to prescription [6]. Moreover, a growing number of genetic associations have been developed into clinically useful tests through international guidelines, including those of the Clinical Pharmacogenetics Implementation Consortium (CPIC), the Royal Dutch Association for the Advancement of Pharmacy - Pharmacogenetics Working Group (DPWG) and the Canadian Pharmacogenomics Network for Drug Safety (CPNDS).

The CPIC has developed evidence-based guidelines on pharmacogenetic testing for 54 drug-gene pairs, which support and guide the translation of clinically relevant findings into practice. In addition, pharmacogenetics-based therapeutic recommendations for 94 and 8 drug-gene pairs have been published by the DPWG and the CPNDS, respectively [7]. Applying pharmacogenetic information before prescribing the corresponding medication helps to avoid serious ADRs or to guide genotype-specific dosing, thereby enhancing the effective use of drug treatment. National drug agencies have therefore approved drug labels containing pharmacogenetic information. As of August 1, 2020, 335, 134, 105 and 52 such drug labels had been approved by the United States Food and Drug Administration (US FDA), the European Medicines Agency (EMA), the US Health Care Service Corporation (HCSC) and the Pharmaceuticals and Medical Devices Agency (PMDA) of Japan, respectively [6, 7].

Currently, the most important criteria for implementing pharmacogenetic testing in clinical practice are not only clinical evidence but also value for money, which can be demonstrated by an economic evaluation, a vital tool for informing resource allocation in the decision-making process, especially in developed countries [8, 9]. Methodological rigor in cost-effectiveness studies is therefore required to increase the reliability of such studies. To date, two systematic reviews have focused specifically on economic evaluations of pharmacogenetic testing to prevent ADRs [10, 11]. The first review [10], published in 2008, aimed to determine the cost of thiopurine methyltransferase (TPMT) genotyping per averted case of neutropenia. It did not, however, evaluate the quality of the included studies. The second review [11], published in 2016 and including all studies up to 2015, assessed the quality of studies in terms of their reporting and the evidence for the clinical effectiveness of testing but did not consider other parameters. Notably, it has been suggested that the sources of evidence for clinical effectiveness, baseline clinical data, resource utilization, cost and utility data, all of which can bias the results of economic evaluations, should be taken into account [12].

Therefore, this review aimed to update the earlier systematic reviews and critically appraise the quality of existing economic evaluations of pharmacogenetic testing to prevent ADRs, in terms of reporting and of the sources of evidence used for all significant model inputs, such as clinical effectiveness, baseline clinical data, resource use, costs and utilities. Given the methodological differences across studies, as well as population-level and system-level differences, our findings could help to identify the model parameters that influence cost-effectiveness results and their transferability across geographic regions. They could also inform future, robust cost-effectiveness analyses of pharmacogenetic testing to prevent ADRs, which might help policy-makers to allocate resources effectively and implement such testing in clinical practice.

Methods

The systematic review protocol was registered with PROSPERO, an international prospective register of systematic reviews (identification number CRD42019142060), and is available from: http://www.crd.york.ac.uk/PROSPERO/display_record.php?ID=CRD42019142060. The present systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [13].

Identification of studies

We conducted a systematic search in Medline (via PubMed), Scopus and the Centre for Reviews and Dissemination (CRD)'s National Health Service Economic Evaluation Database (NHS EED) to identify relevant studies up to October 2019. The search terms were constructed based on the PICOS domains (patient, intervention, comparison, outcome and study type) and comprised the domains of intervention (pharmacogenetic testing and ADRs) and study type (economic evaluation). There were no restrictions in the domains of patients, comparators and outcomes. The search terms used for each search engine and the search strategies for each database are given in the online appendices (Electronic Supplementary Material) Table A1. The reference lists of the retrieved studies were also explored to identify further studies. The search was updated every six months.

Selection of studies

Two reviewers (ST and OR) independently selected studies by screening the titles and abstracts of all articles against the eligibility criteria. Full texts of articles identified in the initial screening were retrieved. Studies were included if they met both of the following criteria. First, they investigated pharmacogenetic testing of human genetic variations that guided drug therapies to prevent ADRs. Second, the study type was an economic evaluation, e.g., cost-effectiveness analysis (CEA), cost-utility analysis (CUA), cost-benefit analysis (CBA) or cost-minimization analysis (CMA). Studies were excluded if the drug and pharmacogenetic test were not on the list of the available clinical practice guidelines (e.g., CPIC, DPWG, CPNDS), or if the prescribing information for labelling was not approved by the US FDA as of August 1, 2020. Any disagreements were resolved through discussion.

Data extraction

Data were extracted independently by two authors (ST and OR) using a data extraction form, which covered the study characteristics: author, year of publication, setting, target population, intervention, comparator, marker frequency, methods, perspective, time horizon, discounting, uncertainty analysis, and outcome measures in terms of incremental cost, incremental cost per quality-adjusted life year (QALY) or life year (LY) gained, or cost per adverse reaction/event avoided. We also gathered the parameters that may affect the cost-effectiveness results according to the uncertainty analysis results of the individual studies.
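For reference, the incremental outcome measures listed above are conventionally expressed as an incremental cost-effectiveness ratio (ICER). The notation below is an assumption introduced here for illustration and is not taken from the included studies: C and E denote the expected cost and the expected effect (QALYs, LYs or adverse events avoided) per patient, with the subscript "test" for the pharmacogenetic testing strategy and "0" for the comparator without testing.

```latex
\[
\mathrm{ICER}
  = \frac{C_{\mathrm{test}} - C_{0}}{E_{\mathrm{test}} - E_{0}}
  = \frac{\Delta C}{\Delta E}
\]
```

Under this convention, a strategy with a negative incremental cost and a positive incremental effect is cost-saving, and the ratio itself is not informative in that case.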

Quality assessment of economic evaluation reporting

The quality of economic evaluation reporting was appraised using the 24-item Consolidated Health Economic Evaluation Reporting Standards (CHEERS) checklist [14]. Two authors (ST and OR) independently assessed the quality of reporting, and any disagreements were resolved through discussion. The percentage of agreement and disagreement was calculated for each checklist item. Each item was scored as follows: the study met all standards (score = 1), the study met some standards (score = 0.5), the study did not meet the standards (score = 0), or the item was not applicable (N/A). For instance, for the checklist item on whether the study reported the time horizon and described its appropriateness, a score of 1 was given if the authors reported both the time horizon and the reason why it was appropriate, 0.5 if they reported either the time horizon or a description of its appropriateness, and 0 if they reported neither.
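The scoring rule and the per-item agreement calculation can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the review: the function names are hypothetical, the two-part rule mirrors only the time horizon example above, and agreement is taken to mean identical scores from both raters.

```python
from typing import Optional, Sequence

# A score is 1, 0.5 or 0; None stands for "not applicable" (N/A).
Score = Optional[float]


def score_two_part_item(reported: bool, justified: bool) -> float:
    """Score an item with two requirements (e.g. time horizon and its rationale).

    Both met -> 1.0, exactly one met -> 0.5, neither met -> 0.0.
    """
    return (int(reported) + int(justified)) / 2


def percent_agreement(rater_a: Sequence[Score], rater_b: Sequence[Score]) -> float:
    """Percentage of studies on which two raters gave the same score for one item."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of studies")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


if __name__ == "__main__":
    # Hypothetical scores for one checklist item across four studies.
    first_rater = [1, 0.5, 0, 1]
    second_rater = [1, 0.5, 0.5, 1]
    print(score_two_part_item(reported=True, justified=False))
    print(percent_agreement(first_rater, second_rater))
```

Simple percentage agreement does not correct for agreement expected by chance; a chance-corrected statistic would be a separate design choice.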

Quality assessment of evidence used

The quality of evidence for input parameters used in economic evaluations, such as clinical effect sizes, baseline clinical data, resource use, costs and utilities (for cost-utility analyses), was assessed using the hierarchy of data sources developed by Cooper et al. [12]. Each item was given a rank from 1 to 6, with 9 assigned to a source that was not clear. For example, for clinical effect sizes, rank 1+ or 1 was given if the data were obtained from a meta-analysis of randomized controlled trials (RCTs) or a single RCT, respectively, with direct comparison measuring final outcomes. The other ranks were: rank 2 (a single RCT with direct comparison measuring surrogate outcomes), rank 3 (a single placebo-controlled RCT measuring surrogate outcomes), rank 4 (case-control or cohort studies), rank 5 (case reports or case series) and rank 6 (expert opinion). Two authors (ST and OR) independently assessed and ranked the data sources of the input parameters. Any disagreements were resolved through discussion, and the percentage of agreement and disagreement was calculated for each input parameter.
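As an illustration, the ranking of clinical effect size sources summarized above can be written as a lookup table. This is a sketch only: the dictionary keys are hypothetical labels introduced here, and the mapping follows the simplified description in this section rather than the full published hierarchy.

```python
from collections import Counter
from typing import Iterable

# Ranks for clinical effect sizes as summarized in the text (keys are hypothetical labels).
EFFECT_SIZE_RANKS = {
    "meta_analysis_of_rcts_final_outcomes": "1+",
    "single_rct_final_outcomes": "1",
    "single_rct_surrogate_outcomes": "2",
    "placebo_rct_surrogate_outcomes": "3",
    "case_control_or_cohort": "4",
    "case_report_or_case_series": "5",
    "expert_opinion": "6",
}
UNCLEAR = "9"  # assigned when the source is not clear


def rank_source(source_type: str) -> str:
    """Return the rank for a source label, or '9' if the label is not recognized."""
    return EFFECT_SIZE_RANKS.get(source_type, UNCLEAR)


def rank_distribution(source_types: Iterable[str]) -> Counter:
    """Count how many studies fall into each rank."""
    return Counter(rank_source(s) for s in source_types)


if __name__ == "__main__":
    # Hypothetical source labels for three studies.
    studies = ["single_rct_final_outcomes", "case_control_or_cohort", "not reported"]
    print(rank_distribution(studies))
```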

Transferability assessment of economic evaluation studies

We applied the transferability method developed by Welte et al. [15] to identify potential transferability factors across countries, which can be categorized into three groups: (1) methodology (i.e., perspective, time horizon, cost categories, and discount rate), (2) healthcare system (i.e., practice variation and technology availability), and (3) population characteristics (i.e., disease incidence/prevalence, life expectancy, acceptance and compliance). From our review, economic evaluation studies of pharmacogenetic testing conducted in different countries were selected as a case study to assess whether differences in these transferability factors could directly affect the costs and outcomes reported in the economic evaluation results.

Results

Search results

A total of 6718 studies were identified from Medline (1544 studies), Scopus (3010 studies) and the Cochrane Database of Systematic Reviews (CDSR) (2164 studies). After excluding 816 duplicates, the titles and abstracts of 5902 studies were screened. Of these, 5824 studies were excluded for several reasons, the most common being “non-genetic interventions” and “non-drug-related ADRs”, as described in Fig. 1. A total of 64 studies met the inclusion criteria. However, five of these were excluded because they were not covered by the list of available clinical practice guidelines. Finally, 59 studies were eligible for data extraction.
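The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures reported above; the variable and function names are hypothetical.

```python
# Record counts as reported in the search results paragraph.
identified_by_source = {"Medline": 1544, "Scopus": 3010, "CDSR": 2164}
duplicates_removed = 816
met_inclusion_criteria = 64
excluded_not_in_guidelines = 5


def flow_counts() -> dict:
    """Derive the totals that follow arithmetically from the reported counts."""
    identified = sum(identified_by_source.values())
    screened = identified - duplicates_removed
    eligible = met_inclusion_criteria - excluded_not_in_guidelines
    return {"identified": identified, "screened": screened, "eligible": eligible}


if __name__ == "__main__":
    for stage, n in flow_counts().items():
        print(f"{stage}: {n}")
```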

Fig. 1 PRISMA flow of study selection process

Characteristics of the included studies

The general characteristics of all included studies are presented in online appendices (Electronic Supplementary Material) Table A2. All studies were published between 2002 and 2018. CUA was the most frequent type of economic evaluation (41 studies; 70%), followed by CEA (10 studies; 17%), CBA (5 studies; 8%) and CMA (3 studies; 5%). Table 1 shows the number of studies categorized by therapeutic area-gene and ADRs, as well as by region. The majority of the studies were conducted in European and American countries (43 studies; 73%), while studies related to HLA-B*58:01-allopurinol and HLA-B*15:02-carbamazepine were mostly conducted in Asian countries. The most frequently studied therapeutic area was cardiovascular diseases (24 studies) [16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39], followed by gout (8 studies) [40,41,42,43,44,45,46,47], human immunodeficiency virus (HIV) infection (8 studies) [48,49,50,51,52,53,54,55], autoimmune diseases (8 studies) [56,57,58,59,60,61,62,63], epilepsy/neuropathic pain (6 studies) [64,65,66,67,68,69], cancer (3 studies) [70,71,72], major depressive disorder (1 study) [73] and hormone replacement therapy (1 study) [74].

Table 1 Number of studies classified by therapeutic area-gene and ADRs and by region

The most frequently evaluated pharmacogenetic tests to prevent ADRs were CYP2C9 and VKORC1 testing before prescription of warfarin (14 studies) [16,17,18,19,20,21,22,23,24,25,26,27,28,29], CYP2C19 genotype screening for selection of antiplatelet therapy (i.e., clopidogrel) after percutaneous coronary intervention (PCI) in acute coronary syndrome (ACS) patients (9 studies) [30,31,32,33,34,35,36,37,38], and HLA-B*58:01 screening before prescribing allopurinol in patients with gout (8 studies) [40,41,42,43,44,45,46,47]. The ADRs related to gene-drug pairs were grouped by severity into two major types: severe ADRs (life-threatening or fatal ADRs) and common ADRs. The gene-drug pairs associated with severe ADRs were HLA-B*58:01-allopurinol-induced SJS/TEN/drug reaction with eosinophilia and systemic symptoms (DRESS), HLA-B*57:01-abacavir-induced hypersensitivity reaction, HLA-B*15:02- and HLA-A*31:01-carbamazepine-induced SJS/TEN/hypersensitivity, TPMT-azathioprine-induced severe bone marrow toxicity, UGT1A1-irinotecan-induced severe neutropenia and DPYD-fluoropyrimidine-induced severe hematologic/gastrointestinal (GI) toxicity. The remaining gene-drug pairs were associated with common ADRs. For most gene-drug pairs, the genetic information was published in a CPIC guideline [75] and the drug labels were approved by the US FDA, the exception being the study of Factor V Leiden screening before receiving estrogen-containing oral contraceptives [76].

Quality assessment of economic evaluation reporting using the CHEERS checklist

The quality of economic evaluation reporting using the CHEERS checklist [14] is summarized in online appendices (Electronic Supplementary Material) Table A3. All included studies clearly described the study population, the measurement of effectiveness (single study-based or synthesis-based estimates) and the approaches for estimating resource use and costs. The percentage of agreement between the two independent authors ranged from 81 to 100%, indicating that the score rating for each item was reliable. In contrast, only 22% of the single study-based (trial-based) economic evaluations performed uncertainty analysis of the parameters and evaluated their effects, as detailed in Table 2. Notably, 18 studies (31%) adopted a health service or healthcare payer’s viewpoint in the analysis, while 10 studies (17%) and 9 studies (15%) adopted societal and healthcare system perspectives, respectively. Nine studies (15%) did not mention the study’s perspective [27, 28, 32, 41, 49, 57, 59, 66, 74]. The time horizon used for evaluating costs and consequences ranged from six weeks to a lifetime, while five studies (8%) did not state the time horizon [32, 48, 61, 66, 70]. Among the studies that specified a time horizon exceeding one year, seven (18%) did not report the discount rate for costs and outcomes [32, 39, 48, 49, 61, 66, 70].

Table 2 Quality assessment results of economic evaluation reporting using the CHEERS checklist

For clinical effectiveness data, 50 studies (85%) used a single study-based estimate and nine studies (15%) used synthesis-based estimates. All studies clearly described the source of evidence. Most studies (50 studies; 85%) were based on a decision-analytic model, while nine studies (15%) were single study-based economic evaluations. Seven of the latter (78%), including retrospective, observational or RCT-based economic evaluations, did not mention uncertainty analysis [27, 32, 41, 48, 59, 61, 66], as they did not report confidence intervals or other measures of uncertainty. In contrast, all model-based economic evaluations performed uncertainty analysis. More than half (26 of 50 model-based studies; 52%) performed both one-way sensitivity analysis and probabilistic sensitivity analysis (PSA), while some conducted one-way sensitivity analysis only (6 studies; 12%) or PSA only (6 studies; 12%). Twelve studies (20%) did not describe any source of funding [21, 28, 40, 41, 48, 51, 54, 57, 59,60,61,62], and 13 studies (22%) did not describe potential conflicts of interest of study contributors [29, 37, 43, 45, 52, 54, 57, 59,60,61, 63, 71, 74].

Quality assessment of evidence used

Only 16 studies (27%) obtained clinical effectiveness data for testing from high-quality evidence (a single RCT with direct comparison, rank 1), while about half (49%) retrieved evidence from case-control or cohort studies (rank 4). In addition, 8 studies (14%) used clinical effectiveness data from a meta-analysis of case-control studies with direct comparison, a source not listed in the hierarchy of data sources by Cooper et al. [12]. Moreover, only four studies (7%) and one study (2%) took baseline clinical data from high-quality evidence [case series specifically performed for the study (rank 1) and analysis of administrative databases including only patients in the setting of interest (rank 2), respectively], whereas most studies (85%) obtained baseline clinical data from old case series, analysis of reliable administrative databases, or estimates from RCTs (rank 4). On the other hand, most studies sourced resource use and cost information from high-quality evidence [prospective data analysis conducted for the specific study (rank 1) or recently published cost estimates based on reliable databases (rank 2)], except for one study on HLA-B*57:01-abacavir, which used data from expert opinion (rank 6) [52]. For CUA studies, the utility data were mostly (93%) direct utility estimates from a previous study in patients with the disease of interest (rank 3). Only one study for CYP2B6-efavirenz did not define the data source (rank 4) [53]. No utility evidence was obtained from indirect utility assessment or expert opinion. Overall, the percentage of agreement between the two independent authors ranged from 81 to 97%, suggesting that the ranking of data sources was reliable (see Table 3).

Table 3 Quality assessment results of evidence used based on the hierarchy of data sources by Cooper et al. [12]

Cost-effectiveness results

In terms of pharmacogenetic testing and drugs associated with ADRs in particular disease areas, such as cardiovascular diseases, gout, HIV infection, autoimmune diseases, epilepsy/neuropathic pain, cancer, major depressive disorder, and hormone replacement therapy, the results of economic evaluation studies are summarized and presented in Table 4.

Table 4 Cost-effectiveness results of included studies

Cardiovascular diseases

CYP2C9 and VKORC1 and warfarin-induced risk of bleeding

There were 14 economic evaluation studies of CYP2C9 and VKORC1 testing before prescription of warfarin to prevent the risk of bleeding. Over the period from 2004 to 2017, these studies were conducted in Korea (1 study) [16], UK and Sweden (1 study) [17], Croatia (1 study) [18], Thailand (1 study) [21], Sweden (1 study) [20], Netherlands (1 study) [28] and US (8 studies) [19, 22,23,24,25,26,27, 29]. Ten studies were model-based CUAs [16,17,18,19, 21,22,23,24,25,26] and one was a trial-based CUA [20] that estimated resource use associated with the interventions. Two studies were model-based CEAs [28, 29] and one was a CEA based on a retrospective study [27]. Seven studies were explicitly conducted in patients with atrial fibrillation and one study investigated mechanical heart valve replacement (MHVR). The remaining studies were conducted in patients newly initiating warfarin therapy. Nine studies showed that CYP2C9 and VKORC1 testing to prevent the risk of bleeding would be a cost-effective intervention [16,17,18,19,20, 22, 27,28,29] (i.e., less costly and more effective than treatment without genotyping). One study suggested that testing would be cost-effective if it increased the time spent in the target international normalized ratio (INR) range during the first three months of treatment by 5 to 9% [25]. Nevertheless, four studies from Thailand [21] and US [23, 24, 26] suggested that testing would not be cost-effective, a result attributed to the effectiveness of testing in reducing out-of-range INR values.

CYP2C19 and clopidogrel-induced major adverse cardiovascular events (MACE)

Nine studies conducted an economic evaluation of CYP2C19 testing before prescription of clopidogrel to avoid cardiovascular events in patients with ACS undergoing PCI. They were performed between 2012 and 2018 in Hong Kong (1 study) [30], Netherlands (1 study) [32], Australia (1 study) [35], New Zealand (1 study) [38] and US (5 studies) [31, 33, 34, 36, 37]. Seven studies were model-based CUAs [30, 31, 33,34,35,36, 38], one was a trial-based CUA [32] and one was a model-based CEA [37]. All studies showed that CYP2C19 testing would be a potentially cost-effective strategy for avoiding MACE. Nevertheless, some studies considered ticagrelor and/or prasugrel, which cost more than clopidogrel, as alternative drugs for those who tested positive.

Pharmacogenetic testing and statin-induced myopathy

One study evaluated testing to identify the risk of statin-induced myopathy in cardiovascular patients using a model-based CUA in 2017. The results demonstrated that genotyping would be a cost-effective intervention in Canada, with a testing cost of CAN$ 906 that was less than the cost of no testing [39].

Gout

HLA-B*58:01 and allopurinol-induced SJS/TEN and DRESS

Eight economic evaluation studies of HLA-B*58:01 screening before prescribing allopurinol in gout patients to prevent SJS/TEN and DRESS were performed from 2014 to 2018 in China (1 study) [41], Malaysia (1 study) [40], US (1 study) [44], Taiwan (1 study) [43], UK (1 study) [42], Singapore (1 study) [46], Korea (1 study) [45] and Thailand (1 study) [47]. Six studies were model-based CUAs [40, 42,43,44, 46, 47], one was a trial-based CMA [41] and one was a model-based CBA [45]. All studies were conducted in patients with gout, and two were explicitly conducted in gout patients with chronic kidney disease (CKD) [43, 45]. Most of the studies considered allopurinol-induced SJS/TEN, and only one study considered both SJS/TEN and DRESS. Five studies applied febuxostat as the alternative drug in the model [41,42,43,44,45] for patients who tested positive for HLA-B*58:01. However, probenecid was used in the studies from Malaysia, Thailand and Singapore [40, 46, 47], as febuxostat is not regularly used in usual clinical practice there.

Three studies showed that HLA-B*58:01 genotyping would be cost-effective in China, Taiwan and Thailand [41, 43, 47], and the Korean study found it to be a cost-saving intervention [45]. Nevertheless, three studies from Malaysia [40], UK [42] and Singapore [46] suggested that HLA-B*58:01 genotyping would not be cost-effective because the costs of pharmacogenomic testing and of alternative drugs such as febuxostat were too high, and because the efficacy of alternative drugs (e.g., probenecid) was lower than that of allopurinol. Moreover, the US study [44] showed that genotyping would be cost-effective for Asians and African Americans but not for Caucasians or Hispanics. Therefore, the incremental cost-effectiveness ratios (ICERs) might vary substantially across racial or ethnic groups, depending on their HLA-B*58:01 frequency.

HIV infection

HLA-B*57:01 and abacavir-induced hypersensitivity reaction

Seven economic evaluation studies of HLA-B*57:01 screening before prescribing abacavir for HIV-positive patients to prevent hypersensitivity reactions (HSR) were conducted from 2004 to 2018 in Russia (1 study) [48], Singapore (1 study) [49], Spain (1 study) [51], Germany (1 study) [50], UK (1 study) [54] and US (2 studies) [52, 53]. Most studies were conducted in Europe and US [48, 50,51,52,53,54], and only one study investigated an Asian population [49]. Allele frequencies in Europe and US (3.7–7.3%) were much higher than those in Asian populations (1.1% in Han Chinese, 1.8% in Malays), except for Indians (3.6%). Three studies were model-based CUAs [49, 52, 53], two were model-based CEAs [51, 54], one was a model-based CBA [50] and one was a CMA based on a retrospective study [48]. Four studies in Russia, Germany, UK and US demonstrated that HLA-B*57:01 testing would be cost-effective [53, 54] or cost-saving [48, 50] in preventing abacavir-induced HSR as compared with no testing, while two studies showed that it was not cost-effective [51, 52]. In addition, the study in Singapore [49] suggested that genotyping was not cost-effective in patients of Han Chinese or Malay ethnicity but was cost-effective in Indian patients. This was because the frequency of the HLA-B*57:01 allele and the positive predictive value (PPV) were higher in Indians than in Han Chinese and Malays.

CYP2B6 and efavirenz-induced CNS toxicity

The US study demonstrated that CYP2B6 genotyping before prescribing efavirenz to prevent central nervous system (CNS) toxicity in HIV patients was cost-saving as compared with no testing, owing to a lower lifetime cost and a gain in QALYs [55].

Autoimmune diseases

TPMT and azathioprine-induced severe bone marrow toxicity

Azathioprine-induced severe bone marrow toxicity was associated with TPMT in patients with autoimmune diseases, inflammatory bowel disease, idiopathic pulmonary fibrosis (IPF), Crohn’s disease, rheumatoid arthritis or systemic lupus erythematosus. Eight studies were carried out from 2002 to 2014 in UK (1 study) [56], New Zealand (1 study) [58], Scotland (1 study) [61], Korea (1 study) [62], Canada (2 studies) [59, 63] and US (2 studies) [57, 60]. These studies comprised model-based CUAs (2 studies) [57, 58], a trial-based CUA (1 study) [56], model-based CEAs (2 studies) [60, 62], a trial-based CEA (1 study) [61], a model-based CBA (1 study) [63] and a CMA based on a randomized prospective study (1 study) [59]. Five studies showed that testing would be cost-effective [57, 60,61,62] or cost-saving [63] in preventing severe azathioprine-related ADRs as compared with no testing. Nevertheless, two studies from UK [56] and New Zealand [58] suggested that genotyping would not be cost-effective due to higher costs and lower QALYs than azathioprine therapy without testing. One Canadian study found that genetic testing was not cost-saving [59].

Epilepsy/neuropathic pain

HLA-B*15:02 and carbamazepine-induced SJS/TEN

Five economic evaluation studies of HLA-B*15:02 genotyping to prevent the risk of SJS/TEN in patients prescribed carbamazepine (CBZ) were carried out in Malaysia [64], Hong Kong [65], Thailand (2 studies) [66, 67] and Singapore [68] from 2012 to 2017. Four studies were model-based CUAs [64, 65, 67, 68], while the other was a CBA based on a retrospective study [66]. All studies focused on patients diagnosed with epilepsy, and only one study from Thailand [67] also included patients with neuropathic pain. Moreover, the study from Singapore [68] was performed separately for the major ethnic groups, namely Han Chinese, Malays and Indians. Three studies used valproate [64, 67, 68], while the other two allowed any anti-epileptic drug as the alternative for patients who tested positive for HLA-B*15:02 [65, 66].

Three studies showed that testing would be cost-effective [65, 68] or cost-saving [66] in preventing CBZ-induced SJS/TEN, as compared with no testing. However, a study in Malaysia indicated that testing would not be cost-effective, a result influenced by ethnicity and by the effectiveness of the alternative drug for those who tested positive [64]. The study in Thailand showed that HLA-B*15:02 screening would be cost-effective in CBZ-treated patients with neuropathic pain but not in those with epilepsy, because the cost of alternative drugs for epilepsy was approximately two times higher than that for neuropathic pain [67].

HLA-A*31:01 and carbamazepine-induced SJS/TEN and hypersensitivity

Notably, CBZ has been associated with HLA-A*31:01 and it can lead to severe ADRs, such as SJS/TEN and hypersensitivity. A study in UK was performed using CUA with model-based economic evaluation in 2015. The results showed that testing would be cost-effective as the efficacy (e.g., remission rate) of anti-epileptic drugs was the main driver of cost-effectiveness results [69]. In addition, this study used lamotrigine as an alternative drug for patients who tested positive rather than valproate, which might be different from other clinical settings.

Cancer

UGT1A1 and irinotecan-induced severe neutropenia

One CUA with model-based study from France [71] and one CEA with model-based study from the US [72] were performed to evaluate the cost-effectiveness of UGT1A1 screening before prescribing irinotecan to prevent severe neutropenia in metastatic colorectal cancer. The results demonstrated that genotyping would be a cost-effective intervention.

DPYD and fluoropyrimidines-induced severe hematologic and GI toxicity

One study was conducted in the Netherlands [70] using CBA with model-based economic evaluation in 2016. The results demonstrated that DPYD testing before prescription of fluoropyrimidines would be cost-saving, as compared with no testing, in preventing severe hematologic and GI toxicity due to fluoropyrimidine.

Major depressive disorder

CYP2D6 and nortriptyline-induced anticholinergic symptoms

The model-based CUA from the Netherlands showed that CYP2D6 screening to adjust the dose before starting nortriptyline would not be cost-effective compared with no screening, since CYP2D6 was not found to be related to a reduction in ADRs or to increased efficacy of nortriptyline in major depressive disorder [73].

Hormone replacement therapy

Factor V Leiden and estrogen combined in oral contraceptives-induced thromboembolism

A model-based CUA in the US evaluated the cost-effectiveness of Factor V Leiden (FVL) testing before prescription of estrogen-containing oral contraceptives to avoid thromboembolism. The study compared three strategies, namely testing before prescribing the drug, testing with oral contraceptive pill (OCP) counselling, and testing with OCP counselling and anticoagulation (AC), with usual care without testing. The results demonstrated that testing with OCP counselling and prophylactic AC during high-risk periods in female relatives of FVL carriers was cost-effective [74].

Uncertainty analysis results

Based on the results of uncertainty analysis from the included studies, the parameters that could influence the cost-effectiveness results are summarized by therapeutic area and gene-drug pair in Table 5. These parameters were classified into three types: (1) epidemiological and disease progression parameters, e.g., probability of ADRs related to drug treatment, allele frequency, PPV or negative predictive value (NPV), and mortality rate of ADRs, (2) clinical effectiveness data, e.g., the efficacy of genetic testing and drug treatment, and (3) resource use and cost parameters, e.g., costs of genetic testing, alternative drugs and hospitalization.

Table 5 Number of studies reporting parameters which could influence the cost-effectiveness results

Our review indicated that cost-effectiveness results were most sensitive to the probability of drug-induced ADRs, the effectiveness of pharmacogenetic testing in preventing ADRs, the cost of testing, and the cost of alternative drugs for patients who tested positive. For instance, in cardiovascular diseases, the ICER results were mostly affected by the probability of MACE due to clopidogrel among clopidogrel users and by the efficacy of CYP2C9 and VKORC1 testing in avoiding bleeding complications among warfarin users. Furthermore, for HIV infection, the cost of testing had an impact on the ICER results for both abacavir and efavirenz. However, several studies did not report uncertainty analysis results from a one-way sensitivity analysis [18, 20, 32, 37, 41, 48, 59, 61, 66, 74].
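To illustrate how a one-way sensitivity analysis ranks such parameters, the following is a minimal sketch of a simplified test-versus-no-test model. It is not the model used in any included study: the class `Inputs`, the functions `icer` and `one_way`, and every numeric value are hypothetical placeholders chosen only for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Inputs:
    """Hypothetical per-patient inputs; none of these values come from the review."""
    carrier_freq: float = 0.10     # proportion of patients testing positive (assumed)
    p_adr_carrier: float = 0.05    # probability of an ADR in an untested carrier (assumed)
    test_effect: float = 0.90      # share of carrier ADRs avoided by testing (assumed)
    cost_test: float = 100.0       # cost of the genetic test (assumed)
    cost_alt_drug: float = 500.0   # extra cost of the alternative drug per carrier (assumed)
    cost_adr: float = 20000.0      # cost of treating one ADR (assumed)
    qaly_loss_adr: float = 2.0     # QALYs lost per ADR (assumed)


def icer(x: Inputs) -> float:
    """Incremental cost per QALY gained for testing versus no testing."""
    adrs_avoided = x.carrier_freq * x.p_adr_carrier * x.test_effect
    delta_cost = (x.cost_test
                  + x.carrier_freq * x.cost_alt_drug
                  - adrs_avoided * x.cost_adr)
    delta_qaly = adrs_avoided * x.qaly_loss_adr
    return delta_cost / delta_qaly


def one_way(base: Inputs, ranges: dict) -> list:
    """Vary one parameter at a time between its low and high value.

    Returns (name, ICER at low, ICER at high) tuples, sorted so that the
    parameter producing the widest ICER swing comes first.
    """
    rows = []
    for name, (low, high) in ranges.items():
        icer_low = icer(replace(base, **{name: low}))
        icer_high = icer(replace(base, **{name: high}))
        rows.append((name, icer_low, icer_high))
    rows.sort(key=lambda r: abs(r[2] - r[1]), reverse=True)
    return rows


if __name__ == "__main__":
    # Hypothetical low/high ranges for two of the parameters discussed above.
    ranges = {"cost_test": (50.0, 200.0), "cost_adr": (10000.0, 30000.0)}
    for name, lo, hi in one_way(Inputs(), ranges):
        print(f"{name}: ICER from {lo:.0f} to {hi:.0f} per QALY")
```

Sorting by the width of the ICER swing is the ordering typically shown in a tornado diagram; with real inputs, the ranges would come from confidence intervals or plausible bounds in the source data.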

The transferability of economic evaluation results

Based on the method of Welte et al. [15] for assessing the transferability of economic evaluation results across countries, three transferability factors were examined using the economic evaluations of HLA-B*5801-allopurinol in gout patients [40–47] as a case study. First, methodological characteristics, e.g., perspective, time horizon, cost categories, and discount rate, varied across CUA studies. Among six CUA studies, a healthcare payer perspective was the most common, followed by a societal perspective. A lifetime horizon was applied in most studies. Cost categories and discount rates differed. Although three main direct medical costs, namely the cost of HLA-B*5801 testing, the cost of treating ADRs and the cost of gout maintenance treatment, were included in most studies, the cost of managing an acute flare or death was considered in only some studies. Studies taking a societal perspective also incorporated direct non-medical costs, e.g., transportation and additional food costs for patients and their relatives, and indirect costs, e.g., productivity loss due to illness. Costs and outcomes were discounted at a rate of 3% or 3.5%.
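The effect of the discount rate on a long time horizon can be seen in a short sketch. The function name `present_value`, the 30-year horizon and the annual amount are assumptions for illustration; only the 3% and 3.5% rates come from the included studies. The sketch also assumes that the first year is undiscounted, which is one common convention and not necessarily the one each study used.

```python
def present_value(annual_amounts, rate):
    """Discount a stream of yearly amounts to present value.

    Assumes the first entry occurs in year 0 and is not discounted.
    """
    return sum(amount / (1.0 + rate) ** year
               for year, amount in enumerate(annual_amounts))


if __name__ == "__main__":
    # Hypothetical: a constant annual maintenance cost over 30 years.
    stream = [1000.0] * 30
    for rate in (0.03, 0.035):
        print(f"rate {rate:.1%}: present value {present_value(stream, rate):.2f}")
```

The same function applies to outcomes such as QALYs. Because a higher rate shrinks later costs and benefits more strongly, results computed at 3% and at 3.5% are not directly comparable without re-discounting.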

Secondly, healthcare system characteristics and clinical practice varied among countries. Studies from China, Taiwan, Korea, the UK and the US used febuxostat as the alternative drug in the model, based on the American College of Rheumatology recommendation that allopurinol and febuxostat are first-line agents for the management of gout [77, 78]. However, febuxostat is not regularly used as an alternative drug in general clinical practice in Malaysia, Thailand and Singapore. Even where the same alternative drug was used, the dosage differed across studies. For instance, allopurinol was started at either 100 to 600 mg/day or 100 to 300 mg/day in patients with CKD, febuxostat was used at 40 to 80 mg/day, and probenecid was used at 2 g/day.

Lastly, in terms of population characteristics, disease prevalence was one of the factors that varied substantially and could not be transferred from one country to another. The HLA-B*5801 allele frequency and the PPV for SJS/TEN were the key drivers of the cost-effectiveness results. Interestingly, the US study found that genotyping would be cost-effective for Asians and African Americans but not for Caucasians or Hispanics, because the HLA-B*5801 frequency varied substantially across racial or ethnic groups, which affected the ICERs [44]. Indeed, the HLA-B*5801 allele frequency ranged from 11.9% to 18.5% in Asian studies [40, 41, 43, 45–47], higher than the 0.7% to 3.8% reported in the US and Europe [42, 44].

Furthermore, the PPV was higher in Asian populations than in American and European populations. This implies that Asians who carry the HLA-B*5801 allele are more likely to develop SJS/TEN than American and European carriers. In summary, given the differences in these three transferability factors across countries, the cost-effectiveness results are useful mainly in a context-specific setting and may not be directly transferable from one country to another.
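A rough sketch can show why allele frequency and PPV drive these results. It computes the number of patients who must be tested to avoid one SJS/TEN case under strong simplifying assumptions: the reported allele frequency is treated as the proportion of patients testing positive, every positive patient is switched to an alternative drug, and the switch fully prevents the reaction. The function name and the PPV values are hypothetical; only the frequency ranges are taken from the text above.

```python
def number_needed_to_test(positive_rate, ppv, avoidance=1.0):
    """Patients tested per SJS/TEN case avoided (simplified).

    positive_rate: proportion of tested patients who carry the allele
    ppv:           probability that a carrier given the drug develops SJS/TEN
    avoidance:     share of carrier cases prevented by switching drugs (assumed 1.0)
    """
    for value in (positive_rate, ppv, avoidance):
        if not 0.0 < value <= 1.0:
            raise ValueError("all inputs must lie in (0, 1]")
    return 1.0 / (positive_rate * ppv * avoidance)


if __name__ == "__main__":
    # Frequencies are the range limits reported above; PPVs are hypothetical.
    scenarios = {
        "Asian studies, low":  (0.119, 0.02),
        "Asian studies, high": (0.185, 0.02),
        "US/Europe, low":      (0.007, 0.01),
        "US/Europe, high":     (0.038, 0.01),
    }
    for label, (freq, ppv) in scenarios.items():
        nnt = number_needed_to_test(freq, ppv)
        print(f"{label}: about {nnt:.0f} patients tested per case avoided")
```

Multiplying this number by the unit cost of the test gives the testing cost per case avoided. A lower allele frequency or a lower PPV raises that cost, which is consistent with the population-specific conclusions of the US study [44].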

Discussion

Our study provides the most up-to-date systematic review of economic evaluation studies of pharmacogenetic testing for the prevention of ADRs (59 studies), compared with two previously published systematic reviews in 2008 (7 studies) and 2016 (47 studies). The majority of included studies were conducted in cardiovascular diseases and were mostly from Europe and the US, whereas only one-third were performed in Asian countries. Given that genotype frequencies differ across countries, the cost-effectiveness of pharmacogenetic testing depends on the ethnicity of the patients being tested. For instance, the HLA-B*15:02 allele is more frequent among Asians than Caucasians, while HLA-A*31:01 is rarer in Asians but more frequent in Caucasians. Therefore, routine screening for HLA-B*15:02 before starting carbamazepine therapy in Asians is more useful than screening for HLA-A*31:01, both for clinical implementation and for future economic evaluation studies. In addition, pharmacogenetic testing can prevent severe drug-induced ADRs and reduce the associated economic burden, both of which are of interest to policy-makers and healthcare professionals [2, 4]. Compared with previously published reviews, we included additional economic evaluation studies of other pharmacogenetic tests, such as CYP2D6-nortriptyline, CYP2B6-efavirenz, DPYD-fluoropyrimidines and UGT1A1-irinotecan, as well as testing for statins.

Our review suggested that CUA and CEA were the most common methods for the economic evaluation of pharmacogenetic testing. This is consistent with the recommendations of Col NF et al. [79] and Payne K et al. [80], who noted that economic evaluation methods, i.e., CUA or CEA, could capture all relevant costs and benefits of pharmacogenetic testing. In addition, our review of the cost-effectiveness results of pharmacogenetic testing for the prevention of ADRs showed differences in the parameters, methods and outcomes among included studies. Consequently, this raised concerns about the transferability of cost-effectiveness results from one country to another, an issue that has been increasingly recognized due to healthcare resource constraints [15].

Notably, our systematic review critically appraised all included studies in terms of the quality of reporting and the sources of evidence used for important model input parameters, which had a significant impact on cost-effectiveness results. Based on the quality appraisal of reporting according to the CHEERS checklist [14], most studies complied with the checklist, except for single study-based economic evaluations. Specifically, 78% of the single study-based economic evaluations did not report uncertainty analysis results for the parameters affecting cost-effectiveness results. This may be because these reports did not provide confidence intervals, which are necessary for performing uncertainty analysis. The advantages of analyzing the uncertainty surrounding effects and costs are that it provides a correct evaluation of the expected effects and costs, helps to judge whether existing evidence is sufficient, and allows decision makers to assess the possible consequences of an uncertain decision [81]. Therefore, future single study-based economic evaluations of pharmacogenetic testing should include uncertainty analysis, since this could substantially improve the robustness of their results. Furthermore, our study revealed that some studies failed to report funding (20%) and authors’ disclosure of conflicts of interest (COI) (22%), which could lead to biased results being used in decisions by clinicians, patients and policy-makers, as the authors or funders might have influenced the research findings. Most studies that omitted funding sources (67%) and COI (85%) were published between 2002 and 2010, when reporting of this information was not yet mandated by journal standards.

In addition, our review highlighted two knowledge gaps that should be considered when assessing the quality of data sources used for pharmacogenetic testing. First, data sources for the clinical effectiveness of pharmacogenetic testing to prevent drug-induced serious ADRs were very limited in several therapeutic areas, which is consistent with the previous review [11]. However, we appraised a broader range of evidence sources, covering not only clinical effectiveness data but also baseline clinical values, costs, resource use and utility data. Our results revealed a lack of high-quality evidence, not only for estimating the clinical effectiveness of pharmacogenetic testing but also for baseline clinical data, according to the hierarchy of evidence developed by Cooper et al. [12]. For example, only 16 studies (27%) obtained clinical effectiveness data for genetic testing from five major RCTs: the PREDICT-1 trial [82] for HLA-B*57:01-abacavir, the TARGET trial [83] for TPMT-azathioprine, the ARIES trial [84] and the COUMAGEN trial [85] for CYP2C9 and VKORC1-warfarin, and the PLATO trial [86, 87] for CYP2C19-clopidogrel. Yet, of these RCTs, only two supported pharmacogenetic testing to prevent severe ADRs, namely those induced by abacavir and azathioprine. Interestingly, eight studies (14%) obtained the clinical effectiveness of testing from a meta-analysis of case-control studies with direct comparison, a source that is not listed in the hierarchy of data sources by Cooper et al. [12].

Second, baseline clinical data on pharmacogenetic testing were very limited. Our review revealed that only five studies (9%) explicitly analyzed baseline clinical data from reliable databases that included patients from the study setting. Such specific databases need to include patients who developed severe ADRs, which are rare events, and so might not be commonly available. It should be noted that the quality of sources, especially for clinical effectiveness and baseline clinical data, used in economic evaluations of pharmacogenetic testing would differ from that used for pharmaceutical interventions. Consequently, a specific ranking system for the quality of evidence is needed for economic evaluations of pharmacogenetic testing to prevent ADRs.

Our study had several strengths. First, pharmacogenetic tests and drug-related ADRs were selected based on currently available clinical guidelines and approved drug labels. Thus, only studies related to treatment options used in clinical practice were included, so that the findings reflect tests with a meaningful benefit and might be useful for clinical decision-making and policy implementation. Second, we added several pharmacogenetic tests not covered in previous reviews, such as CYP2D6-nortriptyline, CYP2B6-efavirenz, DPYD-fluoropyrimidines, UGT1A1-irinotecan, and pharmacogenetic testing for statins. Third, we appraised the included studies for both the quality of reporting and the data sources of evidence used, which was broader in scope than previous reviews. This review also described in detail the differences in parameters, methods and economic evaluation results of included studies. Furthermore, we presented a case study evaluating the transferability of study results across countries according to potential transferability factors, showing that economic evaluations of pharmacogenetic testing are useful for a specific setting and might not be transferable to other clinical settings. This is supported by clearly established evidence that race/ethnicity and geographic region influence the prevalence of HLA-B*5801, which was lower in Caucasians and Hispanics (< 1%) than in African Americans (3.8%) and Asians (7.4%) [1, 14].

It is important to acknowledge some limitations of our study. First, we evaluated the quality of data sources for model input parameters using the only existing published criteria for economic evaluation studies, developed by Cooper et al., and the quality of reporting using the CHEERS checklist. However, this ranking of data sources may not be specific to economic evaluations of pharmacogenetic testing. To the best of our knowledge, the International Society for Pharmacoeconomics and Outcomes Research has published guidelines relevant to this topic [88–92], which we did not apply in our study. Those guidelines could be used as criteria in future studies. Second, some studies did not report uncertainty analysis results, which could affect cost-effectiveness results; therefore, we could consider only the results of one-way sensitivity analyses obtained from the included studies.

Conclusions

This comprehensive review identified fifty-nine economic evaluations of pharmacogenetic testing to avoid drug-induced severe ADRs, most of which focused on cardiovascular diseases. CUA and CEA were commonly applied to perform the economic evaluation of pharmacogenetic testing to prevent drug-induced ADRs. Based on the quality appraisal of reporting according to the CHEERS checklist [14], most studies complied with the checklist, although single study-based economic evaluations often omitted uncertainty analysis, which should be reported. The evidence used for clinical effectiveness data and baseline clinical data was considered to be of low quality according to the hierarchy of evidence proposed by Cooper et al. Therefore, criteria for assessing the quality of evidence used in economic evaluations of pharmacogenetic testing to prevent ADRs need to be further developed. Differences in parameters, methods and outcomes across studies, as well as population-level and system-level differences, may make it difficult to compare cost-effectiveness results across countries. Our findings might be useful for developing robust future cost-effectiveness analyses of pharmacogenetic testing to inform policy-makers on how to allocate resources effectively and implement such testing in clinical practice.