BACKGROUND

In 2007 the Food and Drug Administration (FDA) issued a labeling change advising physicians to consider the use of “genetic tests to improve their initial estimate” of warfarin dose.1 This is the first FDA recommendation to consider genetic testing when initiating a commonly prescribed medication and may set a precedent for the future use of genetic technologies in clinical practice. Many forces are driving this technology into practice: increasing numbers of companies promoting testing,2–4 academic institutions racing to be on the cutting edge of clinical medicine, and patients5 interested in the potential of personalized medicine. Indeed, warfarin has several attributes that make it an attractive “target” of personalized medicine: it is commonly prescribed,6 has a narrow therapeutic index with up to 20-fold inter-individual variability in dose-response,7,8 carries an annual major bleeding risk of 1–5%,9–13 and is affected by several common genetic variants that alter warfarin metabolism and activity.14–16

Warfarin dose variability is associated with many factors, including age, height, body weight, race, dietary vitamin K intake, intercurrent illness, drug interactions and genetic variation.17,18 Among Caucasians, an estimated 30% to 40% of warfarin dose variability can be attributed to polymorphisms in genes encoding hepatic isoenzyme cytochrome P-450 2C9 (CYP2C9), which is responsible for metabolic clearance of warfarin, and vitamin K epoxide reductase complex subunit 1 (VKORC1), the enzymatic target of warfarin.8,14,15,19–28

Allelic variants in CYP2C9 and VKORC1 are common, with more than two thirds of the Caucasian population and up to 90% of East Asians manifesting at least one variant.14–16 Affected individuals require, on average, lower doses of warfarin to maintain a therapeutic INR and more time to achieve stable dosing.15,19,21,23,29–31 Carriers of variant alleles are at higher risk for bleeding complications, particularly at the induction of warfarin therapy,32–37 and genotype-guided dosing algorithms better approximate maintenance warfarin dose than fixed-dose algorithms.15,32,38–43 However, a recent analysis by Eckman and colleagues concluded that genotype-guided dosing was unlikely to be cost-effective in nonvalvular atrial fibrillation patients.44 Furthermore, it remains unclear whether pharmacogenetic dosing will reduce the incidence of serious bleeding or over-anticoagulation compared to current methods of initiating and dose-adjusting warfarin.

OBJECTIVES

In order to summarize the current evidence supporting the use of warfarin pharmacogenetics, we performed a systematic review of randomized trials that compared a dose-selection strategy that used pharmacogenetic information to one that did not.

METHODS

Data Sources

We searched PubMed, EMBASE and the International Pharmaceutical Abstracts through January 23, 2009. The complete search strategy is described in the online Appendix 1. In order to identify ongoing clinical trials, we searched http://www.clinicaltrials.gov on February 19, 2009 (online Appendix 1). We examined the reference lists of included articles and professional reviews, and contacted experts to identify other potentially relevant studies.

We included randomized controlled trials that compared clinical outcomes among a pharmacogenetic dosing group, using common genetic variants of CYP2C9 and/or VKORC1, to a dosing algorithm that did not incorporate genetic testing. Eligible studies enrolled adult, warfarin-naïve patients with any indication for warfarin therapy, including atrial fibrillation, venous thromboembolic disease, recent orthopedic surgery and valvular disease.

Data Extraction

Two reviewers (KK, JT) independently extracted the following data using a data abstraction instrument: number of participants in each arm, study quality, length of follow-up period, intervention and control dosing algorithms, and primary and secondary outcomes. Our primary outcomes of interest were the incidence of major bleeding events and time spent in therapeutic range (INR between 2 and 3) as calculated by the linear interpolation method,45 an accepted measure of anticoagulation quality and a potential surrogate for bleeding risk.9,11 Our secondary outcomes of interest included incidence of minor bleeding and thromboembolism, and measures of over-anticoagulation and inadequate anticoagulation, including: time to first therapeutic INR, time to stable warfarin dose, percentage time INR greater than 3 or 4, and relative number of INR blood draws. We resolved all disagreements by discussion and consensus. We contacted the authors of the studies for additional information.
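To make the linear interpolation method concrete, the following is a minimal sketch of how time in therapeutic range could be computed from a series of INR measurements. It assumes that INR changes linearly between consecutive measurements and that follow-up runs from the first to the last measurement; the function name, input layout and example values are illustrative assumptions and are not taken from the included trials.

```python
def time_in_therapeutic_range(measurements, low=2.0, high=3.0):
    """Fraction of follow-up time with interpolated INR inside [low, high].

    measurements: list of (day, inr) tuples sorted by day (hypothetical layout).
    Assumes INR moves along a straight line between consecutive measurements.
    """
    total_days = 0.0
    days_in_range = 0.0
    for (day0, inr0), (day1, inr1) in zip(measurements, measurements[1:]):
        span = day1 - day0
        if span <= 0:
            continue  # skip duplicate or out-of-order days
        total_days += span
        if inr0 == inr1:
            # Flat segment: either entirely inside or entirely outside the range.
            if low <= inr0 <= high:
                days_in_range += span
            continue
        seg_low, seg_high = min(inr0, inr1), max(inr0, inr1)
        # Share of the INR change that falls inside the therapeutic band.
        overlap = max(0.0, min(seg_high, high) - max(seg_low, low))
        days_in_range += span * overlap / (seg_high - seg_low)
    return days_in_range / total_days if total_days else float("nan")


# Made-up example: INR rises from 1.0 to 3.0 over 10 days, then stays at 3.0.
example = [(0, 1.0), (10, 3.0), (20, 3.0)]
print(time_in_therapeutic_range(example))
```

Changing `low` and `high` (for example to 1.8 and 3.2) shows how strongly the result depends on the definition of the therapeutic range.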

We used standard definitions for several outcomes. Major bleeding was categorized according to an international consensus statement46 as any of the following: fatal bleeding, symptomatic bleeding in a critical area or organ, a fall in hemoglobin of 2 g/dl or more, or requirement of transfusion of two or more units of red cells or whole blood. For one study,47 we categorized gastrointestinal bleeds as major bleeds because we were unable to ascertain the severity of events reported. The therapeutic range was defined as an INR between 2 and 3 unless otherwise stated. Time to stable warfarin dose was defined as two consecutive, therapeutic INR values separated by at least 7 days without intervening dose alteration.48
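The definition of time to stable warfarin dose can be sketched in code as follows. This is an illustrative reading only: the record layout is hypothetical, and reporting the day of the first of the two qualifying INR values is an assumption, since the definition above does not state which of the two days is counted.

```python
def time_to_stable_dose(visits, low=2.0, high=3.0, min_gap_days=7):
    """Day on which a stable dose is first reached, or None if never reached.

    visits: list of (day, inr, dose_changed_since_previous_visit) tuples,
            sorted by day (hypothetical layout).
    A stable dose is taken to mean two consecutive therapeutic INR values at
    least `min_gap_days` apart with no dose alteration between them.
    Assumption: the day of the first INR in the qualifying pair is returned.
    """
    for (day0, inr0, _), (day1, inr1, changed) in zip(visits, visits[1:]):
        both_therapeutic = low <= inr0 <= high and low <= inr1 <= high
        far_enough_apart = (day1 - day0) >= min_gap_days
        if both_therapeutic and far_enough_apart and not changed:
            return day0
    return None


# Made-up visit history: the dose is last changed before the day 9 visit.
history = [(3, 1.5, False), (6, 2.4, True), (9, 2.6, True),
           (16, 2.5, False), (23, 2.7, False)]
print(time_to_stable_dose(history))
```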

Study quality was assessed using the Jadad scale.49 The score ranged from 0 to 5 with higher scores indicating higher quality. We further characterized study quality by assessing allocation concealment, comparability of baseline groups, equivalency of loss to follow-up, similarity of co-interventions and whether the analysis was intention to treat.

Data Synthesis and Statistical Analysis

Quantitative data synthesis was performed using STATA version 10 (STATA Corp, College Station, TX). Meta-analysis was performed on primary outcomes. The principal measures of effect size between the intervention and control arms were risk ratio for major bleeding, and the standardized mean difference (SMD) for percentage time INR within therapeutic range (online Appendix 2). In one case,48 we used a weighted average of the percentage time within therapeutic range from both the initiation (first 8 days) and stabilization period (day 9 through stable dose), because only separate estimates were reported in the study. We used a random effects model50 to combine results across studies when appropriate and assessed heterogeneity with the Q statistic51 and I².52–54 In order to test for publication bias, we used the Begg adjusted rank correlation test and the Egger regression asymmetry test.55,56 Pre-specified subgroups for sensitivity analyses included the quality of the trials (poor versus good/excellent) and the length of follow-up (< 30 days versus ≥ 30 days). Sensitivity analysis by genes used in the prediction model was not performed because there were too few studies for it to be meaningful.
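For readers unfamiliar with these quantities, the sketch below shows one common way to pool log risk ratios under a random effects model and to compute the Q statistic and I². It assumes the DerSimonian-Laird estimator of between-study variance and a 0.5 continuity correction for zero cells; the analyses in this review were run in STATA, and this Python code is an illustration rather than a reproduction of them. All function names and example numbers are hypothetical.

```python
import math


def log_risk_ratio(events_t, n_t, events_c, n_c):
    """Log risk ratio (treatment vs control) and its approximate variance."""
    if events_t == 0 or events_c == 0:
        # Assumption: add 0.5 to every cell of the 2x2 table when a cell is zero.
        events_t, events_c = events_t + 0.5, events_c + 0.5
        n_t, n_c = n_t + 1, n_c + 1
    log_rr = math.log((events_t / n_t) / (events_c / n_c))
    variance = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return log_rr, variance


def random_effects_pool(effects, variances):
    """DerSimonian-Laird pooled estimate with Q, tau^2 and I^2 (in percent)."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
            "Q": q, "tau2": tau2, "I2": i2}


# Made-up trial counts: (events, N) in the intervention arm, then the control arm.
trials = [(2, 100, 4, 100), (1, 50, 1, 50), (3, 120, 5, 115)]
logs = [log_risk_ratio(*t) for t in trials]
result = random_effects_pool([y for y, _ in logs], [v for _, v in logs])
print(math.exp(result["pooled"]), result["I2"])
```

Pooling is done on the log scale, so the pooled estimate and its confidence limits are exponentiated to report a risk ratio.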

RESULTS

Study Characteristics

Our search identified 2,014 unique studies. Three studies satisfied all inclusion and no exclusion criteria (Fig. 1). All three studies were small, single-center randomized clinical trials enrolling 38 to 238 patients (Table 1). Follow-up ranged from 22 days in the pharmacogenetic arm of the study by Caraco et al.48 to an average of 46 days across both arms of the study by Anderson et al.57 Patients were almost exclusively Caucasian (97%) older adults taking warfarin for the first time. Demographic characteristics were similar between studies (Table 2), but the treatment setting varied from predominantly outpatient to largely inpatient. Principal indications for warfarin initiation were atrial fibrillation, atrial flutter, deep venous thrombosis and pulmonary embolism in these three studies. Hillman et al.47 also included prosthetic valve and joint patients, and Anderson et al.57 included preoperative orthopedic patients. In all studies, observed genotype frequencies were in accordance with Hardy-Weinberg equilibrium.
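As an illustration of what accordance with Hardy-Weinberg equilibrium means, the sketch below applies a chi-squared goodness-of-fit test to genotype counts at a single locus. It assumes a biallelic locus, which simplifies loci such as CYP2C9 that carry more than one variant allele, and the included studies may have used a different test. The function name and the counts are hypothetical.

```python
import math


def hardy_weinberg_chi2(n_ref_hom, n_het, n_var_hom):
    """Chi-squared test (1 degree of freedom) of genotype counts against
    Hardy-Weinberg proportions at a biallelic locus."""
    n = n_ref_hom + n_het + n_var_hom
    p = (2 * n_ref_hom + n_het) / (2 * n)  # reference allele frequency
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ref_hom, n_het, n_var_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail probability of a chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up counts: wild-type homozygotes, heterozygotes, variant homozygotes.
chi2, p_value = hardy_weinberg_chi2(130, 60, 10)
print(round(chi2, 3), round(p_value, 3))
```

A large p-value gives no evidence of departure from equilibrium; a small one can point to genotyping error or a selected population.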

Figure 1

Flow diagram of the process used to select studies for inclusion (search through January 23, 2009). *General pharmacogenetic reviews, warfarin drug interactions, other drugs metabolized by cytochrome P450 enzymes. †Case reports N = 39, case-control studies N = 15, cohort studies N = 171. ‡References 59, 62, 63.

Table 1 Study Characteristics in the Randomized Trials of Genotype-Guided Warfarin Dosing
Table 2 Patient Characteristics in the Randomized Trials of Genotype-Guided Warfarin Dosing

Each of the three studies used different dosing models for their pharmacogenetic and control dosing arms, which is reflected in the differing average initial doses across the three studies in both groups (Table 1). For the pharmacogenetic arm, the studies by Hillman et al.47 and Caraco et al.48 used dosing models that accounted only for CYP2C9 variants, while Anderson et al.57 incorporated both CYP2C9 and VKORC1 variants. Two of the pharmacogenetic algorithms47,57 were previously validated and adjusted for covariates of age, sex and weight. In contrast, Caraco et al.48 created a new algorithm that estimated warfarin dose based only on CYP2C9 genotype and amiodarone use.

All three studies evaluated outcomes of bleeding and time within therapeutic range. No study reported active surveillance for clinical adverse events of bleeding or venous or arterial thromboembolism.

Study Quality

Study quality varied substantially (Table 3). Caraco et al.48 received the lowest Jadad score (1) for inadequate randomization and blinding. Patients in this study were randomized by the “even” or “odd” last digit of their identity number, and investigators were not blinded after day 8 of follow-up. The intention to treat principle was also violated: 51 excluded patients had initiated warfarin therapy but were not included in the analyses, and there were no data comparing treatment groups at randomization. The authors followed the control arm, on average, for almost twice as long as the pharmacogenetic group (Table 1). However, time-dependent outcomes such as number of bleeding events, percent time in therapeutic range, time spent with out-of-range INR and total number of INR draws were not adjusted to account for different lengths of follow-up time.

Table 3 Quality Assessment of the Randomized Trials of Genotype-Guided Warfarin Dosing

The two remaining studies were of good quality overall. Hillman et al.47 received a Jadad score of 3 due to single-blinded design. Anderson et al.57 received the highest Jadad score (5). However, despite adequate randomization in the Anderson study,57 there was a significantly higher percentage of patients with ≥1 variant allele in CYP2C9 or VKORC1 among the control group compared to the pharmacogenetics group (p < 0.01, Table 2).

Primary Outcomes

Primary outcomes of interest were rates of major bleeding and percentage time INR in the therapeutic range (Table 4). None of the individual studies was powered to show a difference in major bleeding (Fig. 2), and the pooled risk ratio of 0.69 did not achieve statistical significance (95% CI 0.16 to 2.9). While there was no evidence of statistical heterogeneity (chi-squared p = 0.45, I² = 0.0%), there was clinically important variability in the pharmacogenetic and control interventions, length of follow-up and study quality. Egger’s regression did not show evidence of publication bias (p = 0.79); however, this estimate is limited in the presence of only three studies (Online Appendix 3 shows estimates used for meta-analyses).

Figure 2

Forest plot. Meta-analysis of the risk ratio of major bleeding between pharmacogenetic dosing and the control group. This shows a pooled risk ratio of 0.69 favoring pharmacogenetics dosing, though it does not meet statistical significance (95% CI 0.16 to 2.9).

Table 4 Primary and Secondary Outcomes in the Randomized Trials of Genotype-Guided Warfarin Dosing

Figure 3 summarizes the differences in the percentage time in therapeutic range across the three studies. No pooled estimate is presented because there was significant heterogeneity (chi-squared p = 0.03, I² = 72.5%). This outcome showed a strong beneficial effect in the Caraco study,48 but no effect in the other two.47,57 If the poor quality study is excluded, there is no longer significant heterogeneity (chi-squared p = 0.91, I² = 0.0%), and the pooled estimate is not significant (p = 0.76). Notably, time within therapeutic range is as variable across the three studies as it is between intervention groups in the Caraco study48 (Table 4). Other sources of heterogeneity include differences in treatment algorithms and study quality, differing frequency of INR measurements and differing lengths of follow-up. Anderson et al.57 used a more lenient definition of therapeutic INR (1.8 to 3.2), which may also contribute to heterogeneity. Sensitivity analysis using follow-up time of 30 days and a standard definition of therapeutic INR (2 to 3) yielded a similar result (not shown). Heterogeneity remained borderline significant (p = 0.1, I² = 57.4%), suggesting that the variability in study results was due to factors other than differences in follow-up time or INR.

Figure 3

Forest plot. Meta-analysis of average percentage time spent in the therapeutic range. The SMD is the difference in time spent in therapeutic range as a proportion of the standard deviation around the average value for the entire group. Here, no summary estimate is shown due to significant heterogeneity (chi-squared p = 0.03, I² = 72.5%).

Secondary Outcomes

Overall, there were few consistent trends showing a difference in secondary outcomes between the groups. Caraco et al.48 reported an advantage for pharmacogenetic dosing compared to the control group for several outcomes (Table 4), including lower cumulative incidence of minor bleeding, decreased time to first therapeutic INR, decreased time to stable warfarin dose, decreased total number of INR draws and fewer days of INR >3 (1.77 vs. 6.58, p < 0.001). However, the longer follow-up time in the control arm complicates the interpretation of time-dependent results, including the number of bleeding events, total number of INR draws and days of supratherapeutic INR. It is noteworthy that the Hillman47 and Anderson57 studies did not replicate these findings. No study showed a difference in thromboembolism incidence.

The pharmacogenetic dosing groups showed improvement in time to stable warfarin dose compared to the control groups (Table 4) in two of the three studies;48,57 this outcome was not reported in the third.47 Among the studies reporting this outcome, the pharmacogenetic arm was favored with statistically significant results by Caraco et al.48 and near statistical significance by Anderson and colleagues (14.1 versus 19.6 days, p = 0.07).57 The longer follow-up in the control arm of the Caraco48 study did not affect this result.

Ongoing Clinical Trials

We identified at least five ongoing randomized clinical trials comparing pharmacogenetic dosing of warfarin therapy to a non-genetic control algorithm (Table 5). Notably, the National Heart, Lung and Blood Institute (NHLBI) is sponsoring a large (N = 1,238), multi-center, double-blinded, randomized trial comparing a recently validated38,58 clinical plus genotype-guided algorithm to a clinical-only dosing algorithm. In all, randomized controlled trial experience with pharmacogenetic dosing will encompass data collected from more than 2,500 patients.

Table 5 Ongoing Randomized Trials Comparing Pharmacogenetic Warfarin Dosing to a Control-Dosing Algorithm Among Patients Starting Warfarin Therapy

DISCUSSION

Our study found little randomized trial data available to support the hypothesis that pharmacogenetic dosing at the onset of warfarin therapy reduces major bleeding events. An extensive search yielded only three small randomized trials evaluating pharmacogenetic dosing, and among these, there was significant variability in terms of design quality, length of follow-up, intervention and outcome measures. No study had adequate power to evaluate differences in major bleeding rates between groups. In the pooled estimates, there was a trend towards less bleeding with pharmacogenetic dosing, but this should be interpreted with caution because of the differences in design between studies. Percentage time within therapeutic range varied significantly across the studies even with standardized INR range and more uniform follow-up time. This disparity raises concern that methods of ascertainment of this outcome are likely to have differed between studies. There was some evidence that time to stable warfarin dose may be decreased with genotype-guided dosing.

The study by Anderson et al.57 is the highest quality trial published to date, and the only study that incorporated both VKORC1 and CYP2C9. It is notable that there were more variant alleles in the standard dosing arm compared to the pharmacogenetic arm of this study.57 Because patients with variant alleles are known to be more likely to have out-of-range INR and bleeding complications, this difference could have biased the results in favor of the pharmacogenetic arm. Indeed, some outcome estimates in this trial favored pharmacogenetic dosing, but none achieved statistical significance.

Only the Caraco study48 showed statistically significant improvement in nearly all surrogate outcomes with pharmacogenetic dosing. However, the lack of true randomization and allocation concealment, the high loss to follow-up, the lack of intention to treat analysis and the different lengths of follow-up between groups challenge the internal validity of these results. Specifically, the outcomes of total number of bleeding events, percentage time INR in therapeutic range, days of supratherapeutic INR and total number of INR draws are invalidated on the basis of detection bias as a result of the nearly two-fold increased follow-up time for the control group. As an example, the total number of INR draws was 36% higher in the control group, but the average interval between consecutive INR draws was the same between groups.

Both the Hillman47 and Anderson57 studies used a multivariable algorithm to select the initial dose for the patients in the intervention arm, taking into account not only the contribution of genetic variation, but also other well-established factors that are known to affect overall warfarin dose such as age, sex and weight. In contrast, patients in the control arms of these two studies47,57 all received the same initial dose. Despite this seemingly unfair advantage at the outset, neither of these studies demonstrated statistically significant improvement of outcomes for the pharmacogenetic arm.

Is pharmacogenetic dosing of warfarin safer and more effective than a one-size-fits-all strategy followed by careful INR monitoring? The results of our study demonstrate that we still do not know. An uncontrolled study59 evaluating a CYP2C9 dosing algorithm40 in patients initiating warfarin further highlights this uncertainty. Although the algorithm estimated the maintenance warfarin dose well (R² = 0.42, p < 0.001), carriers of variant CYP2C9 alleles continued to have a significantly increased risk of INR >4 (HR 4.6, p < 0.01) compared to those with the wild-type allele. There is evidence that the greatest risk of warfarin-induced adverse events is at the induction of warfarin therapy,60 and that INR levels prior to day 4 of therapy do not predict dose response differences. Thus, the traditional “trial and error” method may result in delays in estimating the appropriate dose.41 However, Li and colleagues61 recently found that CYP2C9 and VKORC1 genotypes did not add to early INR response as a predictor of warfarin sensitivity. Even if pharmacogenetic dosing does not reduce major bleeding, it may still be useful and cost-effective if it results in a shorter time to stable dose and fewer blood draws to attain a stable INR. It is possible, however, that physicians may become more complacent with pharmacogenetic dosing, resulting in reduced surveillance and a paradoxical increase in bleeding during the initiation of warfarin therapy.

Our study has limitations. First, very little high-quality evidence has been published in this area: we identified only three small randomized trials evaluating pharmacogenetic dosing of warfarin. Second, important differences in designs, outcome definitions and follow-up intervals used by these three trials reduced the degree to which we could pool their individual findings. We did not perform meta-analysis of secondary outcomes because of significant heterogeneity of the trials. Third, we did not evaluate genotype-specific outcomes, because these are not relevant when providers are unaware of genotypes in advance. Lastly, we did not include the three prospective cohort studies using genotype-guided algorithms,59,62,63 because others have reviewed this literature,64,65 and we decided a priori that only randomized trials could reliably demonstrate whether pharmacogenetic dosing improves patient outcomes. It may be considered early to perform a systematic review on a topic where so few randomized controlled trials are available; however, given the FDA relabeling, we feel it is important to evaluate the current evidence.

The package insert of warfarin advises that “lower initiation doses should be considered for patients with certain genetic variations in CYP2C9 and VKORC1 enzymes.” This FDA labeling change was made on the basis of accumulation of data66 demonstrating that allelic variants in CYP2C9 and VKORC1 are associated with increased plasma warfarin levels, out of range INR and increased bleeding risk.15,19,23,35,67,68 However, as our study demonstrates, there is no evidence that a more accurate initiation dose reduces the risk of bleeding. Results from ongoing clinical trials will help to clarify the role of genetic testing in warfarin management. A target enrollment of at least 2,000 patients has been suggested,57 and currently the cumulative experience of >2,500 patients is anticipated.

Each of the randomized trials reviewed in our study used different pharmacogenetic and control group dosing algorithms. The most comprehensive and widely available pharmacogenomic algorithm, http://www.WarfarinDosing.org, has been recently validated by the International Warfarin Pharmacogenetics Consortium and will be used in the largest randomized trial sponsored by the NHLBI.38 Until recently, however, there was no widely accepted pharmacogenetic algorithm to guide the initiation of warfarin therapy, and new models are still being developed and validated.69 Although warfarin dosing algorithms do not eliminate the need for frequent INR monitoring and dose titration, these algorithms can, even in the absence of genotype information, provide a very good estimate of the patient’s warfarin dose by taking into account readily available information such as age, gender, weight and smoking status.38 Whether these algorithms improve outcomes compared to other warfarin initiation strategies is not known.

The products of genetic discovery are becoming increasingly relevant to the practice of clinical medicine, particularly in the realm of pharmacogenetics. Genotype-guided warfarin prescribing is currently the focus of much attention and is positioned to set a precedent for how integration of genetic technologies in clinical practice will proceed. In the case of warfarin, it seems intuitive that adjusting warfarin dose to match patients’ genetic makeup will result in fewer complications; however, our review, along with at least one unfavorable cost-effectiveness analysis,44 demonstrates that additional clinical trial data are needed prior to endorsing a new standard of care for warfarin dosing.

CONCLUSION

In conclusion, our study did not find sufficient evidence to support the use of pharmacogenetics to guide warfarin therapy outside of clinical trials at this time. Small sample sizes and heterogeneity across the few available studies precluded definitive estimates of the relative effectiveness of this intervention. We recommend that policy makers and clinicians await the results of larger, high quality randomized trials and better cost-effectiveness analyses before adopting genetic testing as the standard of care for warfarin initiation.