Background

Atrial fibrillation (AF) is the most common arrhythmia [1]. It is accompanied by an increased risk of thromboembolic stroke [1, 2]. For most patients with AF, oral anticoagulation (OAC) is thus recommended for stroke prevention [3]. Vitamin K antagonists (VKA) have long been the standard substances for OAC. Since the approval of the first direct oral anticoagulant (DOAC), dabigatran etexilate [4], a trend towards prescribing DOACs instead of VKA has gained momentum. Currently, four DOACs are approved in Germany: dabigatran, rivaroxaban, apixaban, and edoxaban. Compared with warfarin, the pivotal randomized controlled trials (RCTs) showed either statistically significant but small risk reductions or at least non-inferiority for the outcomes stroke/systemic embolism and major bleeding [5,6,7,8].

For patients with a higher risk of bleeding due to patient-specific criteria such as severely impaired renal function, old age, or low body weight, the European Medicines Agency (EMA) and the Drug Commission of the German Medical Association (AkdÄ) recommend prescribing DOACs at a lower dose [3, 9,10,11,12]. Low-dose DOAC (ld-DOAC) therapy is common internationally. In the ORBIT-AF II registry (USA), 16% of patients with DOACs received a reduced dose [13]. Higher rates of ld-DOACs were reported in Denmark and Germany, ranging from 32 to 52% [14,15,16,17]. In a Japanese single-centre cohort study, the ld-DOAC cohort included as many as 56% of patients [18]. Only for dabigatran and edoxaban were effectiveness and safety of ld-DOAC therapy compared to warfarin in pivotal RCTs [6, 7]. In these trials, a reduced dose seemed to be partially associated with a higher risk for thromboembolic events and a lower risk for bleeding. In a cohort study from Denmark comparing ld-DOACs with warfarin, event rates of ischemic stroke or systemic embolism did not differ [19]. The bleeding rate was significantly lower only with dabigatran. A real-world study by Hohnloser et al. found that the risk for ischemic stroke with ld-DOACs was similar to that with phenprocoumon, whereas the bleeding risk was lower with some ld-DOACs [14]. Although phenprocoumon is used almost exclusively as VKA in Germany, to our knowledge there are no RCTs comparing phenprocoumon and DOACs in the general population of patients with AF. Phenprocoumon differs from warfarin in its pharmacokinetic properties; for example, it has a longer half-life [20]. Studies have shown that time in therapeutic range (TTR) in Germany with phenprocoumon is better than in the RCTs comparing warfarin to DOACs elsewhere [5,6,7,8, 21, 22]. Therefore, transferring the results of the comparison of DOACs or even ld-DOACs with warfarin to phenprocoumon does not seem appropriate.

The aim of this study was to add to the current evidence, which has yielded ambiguous results, by comparing the effectiveness and safety of OAC in patients with AF treated with ld-DOACs as opposed to phenprocoumon in a real-life setting. To our knowledge, this is the first empirical study comparing phenprocoumon with all four DOACs approved in Germany (apixaban, edoxaban, dabigatran, rivaroxaban) in reduced dosage.

Method

A retrospective observational cohort study was conducted using German claims data from several company health insurance funds. Routine health care data were provided and analysed by the Corporation for Efficiency and Quality in Health Insurance (GWQ ServicePlus AG, Gesellschaft für Wirtschaftlichkeit und Qualität bei Krankenkassen: FK, BD). It is owned by a group of health insurance companies covering up to 10.5 million insured persons in Germany. The reporting of the study is based on the German GPS (Good Practice Secondary Data Analysis) [23] and the RECORD (Reporting of studies conducted using observational routinely-collected health data) statement [24].

Data and study population

The dataset included information from outpatient and inpatient care (age, sex, diagnoses, and medications). Data from the years 2014 to 2019 were analysed. Claims data that could not be linked to patients due to coding errors were corrected as far as possible with an internal mapping algorithm. To obtain a dataset of patients with the possibility of at least one year of follow-up with continuous insurance status, patients with a first prescription of OAC in 2015 to 2018, defined as the index date, were included. Furthermore, to be included, patients had to have at least one in- or outpatient diagnosis of AF, no OAC prescription during the pre-index period of 12 months, and at least 12 months of follow-up time after the index date. In case of death during the observation period, there was no minimum follow-up time. For the survivors, continuous insurance status was defined as being insured at the beginning and end of the observation period and having at least one observable insurance day in each observable quarter. Exclusion criteria were receiving more than one oral anticoagulant, receiving DOACs in both low and standard dose, or receiving warfarin as VKA on the index date. Other VKAs were not prescribed. Records with an undefined age and/or sex, age younger than 18 years, or dialysis were also excluded. Patients with pulmonary embolism and/or deep vein thrombosis during the pre-index period, as a competing indication for OAC, were excluded from the sample. The selection of patient data is shown in Fig. 1.

Fig. 1

Data selection process. Index date is defined as the date of first prescription of an oral anticoagulant

From the DOAC sample, only patients with ld-DOACs were considered. Ld-DOAC treatment was defined as a dose lower than the standard dose recommended for prevention of thromboembolism in AF by the summary of product characteristics (SmPC) of the respective DOAC (standard dose: apixaban 2 × 5 mg/day, edoxaban 60 mg/day, dabigatran 2 × 150 mg/day, rivaroxaban 20 mg/day) [9,10,11,12]. The dose was identified via the pharmacy central number (PZN, the identification number for pharmaceutical products in Germany) (Table S1, additional file 1) [3].

Outcome measures

In line with other real-world studies, the observation period was chosen to be 12 months beginning with the date of first prescription [14, 15]. Effectiveness outcomes were hospitalizations due to thromboembolic events, including ischemic stroke, non-specified stroke, transient ischemic attack, and mesenteric ischemia. Another outcome was death from any cause (death coded as reason for deregistration from health insurance). Safety outcomes were major bleeding events, defined as hospitalizations due to bleeding in critical areas or organs, such as intracranial bleeding, and other bleeding events that led to blood transfusion. The definition followed the criteria of the International Society on Thrombosis and Haemostasis (ISTH) (ICD-10 codes in Table S2, additional file 1) [25].

Statistical analysis

Comparisons were made between phenprocoumon and five ld-DOAC groups (ld-apixaban, ld-dabigatran, ld-edoxaban, ld-rivaroxaban, and the composite of all ld-DOACs). When calculating event rates, only the first event per patient was considered for each outcome. Rates were calculated per 100 patient-years. For both event rates and Cox regression, observations were censored at death and at a switch in medication and/or dose.
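The event-rate calculation described above can be illustrated with a minimal Python sketch. It is not the study's code (the analyses were run in R). The record layout (`PatientRecord`), the 365.25-day year, and the choice to end a patient's time at risk at the first event are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional

DAYS_PER_YEAR = 365.25  # assumption: conversion factor from days to patient-years


@dataclass
class PatientRecord:
    """Hypothetical per-patient record for one outcome (not the study's data model)."""
    follow_up_days: float                    # days from index date to end of observation or censoring
    first_event_day: Optional[float] = None  # day of the first event, None if no event was observed


def event_rate_per_100_py(records: List[PatientRecord]) -> float:
    """Crude event rate per 100 patient-years, counting only the first event per patient."""
    events = 0
    days_at_risk = 0.0
    for rec in records:
        has_event = (
            rec.first_event_day is not None
            and rec.first_event_day <= rec.follow_up_days
        )
        if has_event:
            events += 1
            # Assumption: time at risk for this outcome ends at the first event.
            days_at_risk += rec.first_event_day
        else:
            # No event before censoring (death, switch of drug/dose, end of follow-up).
            days_at_risk += rec.follow_up_days
    if days_at_risk <= 0:
        raise ValueError("no time at risk in the supplied records")
    return 100.0 * events / (days_at_risk / DAYS_PER_YEAR)
```

For example, one event observed over a total of three patient-years of time at risk gives a rate of about 33.3 per 100 patient-years.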

Cox regression models were applied to estimate the effectiveness and safety of treatments with adjusted cause-specific hazard ratios. To reduce confounding, risk adjustment was based on the following pre-treatment control variables: (1) age and sex; (2) comorbidities, e.g. arterial hypertension, cachexia, and renal impairment; (3) comedication, e.g. antiarrhythmic and antihypertensive medication (ICD-10 codes/ATC codes in Table S3, additional file 1); (4) CHA2DS2-VASc score, calculated based on the dataset (sex not included, as it is considered separately); (5) Charlson Comorbidity Index (CCI) [26]; (6) effectiveness and safety outcomes that occurred before the index date; (7) dummy variables for each year/quarter of the index dates. As the CHA2DS2-VASc score has a predictive performance similar to that of the widely used HAS-BLED and other predictive scores, we refrained from adjusting for another bleeding risk score [27]. Multicollinearity tests were applied to test whether the treatment effect could be properly distinguished from the confounders.
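To make covariate (4) concrete, the following Python sketch computes a CHA2DS2-VASc score without the sex component, using the standard published point weights. The function name and the boolean inputs are hypothetical. How the individual components were derived from ICD-10 codes in the claims data is not shown here and would have to be taken from Table S3.

```python
def cha2ds2_vasc_without_sex(
    age: int,
    heart_failure: bool,
    hypertension: bool,
    diabetes: bool,
    prior_stroke_tia_te: bool,
    vascular_disease: bool,
) -> int:
    """CHA2DS2-VASc score with the sex category omitted (range 0 to 8).

    Illustrative helper only: inputs are assumed to be already derived
    from diagnoses; the mapping from claims data is not part of this sketch.
    """
    score = 0
    score += 1 if heart_failure else 0        # C: congestive heart failure
    score += 1 if hypertension else 0         # H: hypertension
    if age >= 75:                             # A2: age 75 years or older
        score += 2
    elif age >= 65:                           # A: age 65 to 74 years
        score += 1
    score += 1 if diabetes else 0             # D: diabetes mellitus
    score += 2 if prior_stroke_tia_te else 0  # S2: prior stroke/TIA/thromboembolism
    score += 1 if vascular_disease else 0     # VA: vascular disease
    # Sc (sex category) is deliberately left out, as sex enters the model separately.
    return score
```

A 78-year-old patient with hypertension and diabetes would, under these weights, receive a score of 4 before sex is taken into account.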

A sensitivity analysis was carried out using propensity score matching (PSM). Propensity scores were estimated using the same explanatory variables as in the Cox regressions. A 1:1 nearest neighbour matching without replacement was performed under the constraint that the maximum standardized mean difference between the groups had to be < 0.1 for all confounders. The year/quarter dummy variables were excluded when estimating the propensity scores in order to achieve a comparable treatment and control group, as the composition of VKA/DOACs changed over time. Cohort pairs (n = 5) were formed for all ld-DOAC groups using logistic regression, with ld-DOAC treatment as the binary dependent variable, in the first stage of the matching process. Sample sizes after PSM, the number of patients for whom no matching partner was found, and baseline characteristics after PSM are depicted in Table S4, additional file 1.
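The matching step and the balance criterion can be sketched as follows. This is a simplified illustration in Python, not the procedure actually run in R: it assumes that propensity scores have already been estimated, uses a greedy nearest-neighbour search in the order in which treated patients are listed, and computes the standardized mean difference with the pooled standard deviation of both groups. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def nearest_neighbour_match(
    treated_scores: Sequence[float],
    control_scores: Sequence[float],
) -> List[Tuple[int, int]]:
    """Greedy 1:1 nearest-neighbour matching on the propensity score, without replacement.

    Returns (treated_index, control_index) pairs. Treated patients are processed
    in list order, so the result depends on that order (an assumption of this sketch).
    """
    available = set(range(len(control_scores)))
    pairs: List[Tuple[int, int]] = []
    for t_idx, t_score in enumerate(treated_scores):
        if not available:
            break  # remaining treated patients stay unmatched
        # Closest remaining control; ties are broken by the lower index.
        best = min(available, key=lambda c: (abs(control_scores[c] - t_score), c))
        pairs.append((t_idx, best))
        available.remove(best)  # without replacement: each control is used at most once
    return pairs


def standardized_mean_difference(
    x_treated: Sequence[float],
    x_control: Sequence[float],
) -> float:
    """Standardized mean difference of one covariate (needs at least 2 values per group)."""
    mean_t = sum(x_treated) / len(x_treated)
    mean_c = sum(x_control) / len(x_control)
    var_t = sum((x - mean_t) ** 2 for x in x_treated) / (len(x_treated) - 1)
    var_c = sum((x - mean_c) ** 2 for x in x_control) / (len(x_control) - 1)
    pooled_sd = math.sqrt((var_t + var_c) / 2.0)
    if pooled_sd == 0.0:
        return 0.0  # no variation in either group, so means are treated as balanced
    return (mean_t - mean_c) / pooled_sd
```

After matching, the balance constraint described above would correspond to checking that the absolute value returned by `standardized_mean_difference` is below 0.1 for every confounder in the matched sample.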

After PSM, a two-sample test for equality of proportions with continuity correction was performed for the outcomes thromboembolic events, death, and bleeding.
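For illustration, a two-sample test for equality of proportions with continuity correction can be written as a chi-squared test on the 2 × 2 table with Yates' correction. The Python sketch below is an assumption about the form of the test, intended to correspond to what standard statistical software reports for two groups. The function name is hypothetical and the example counts are invented, not study data.

```python
import math
from typing import Tuple


def two_proportion_test_yates(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Two-sided test of equal proportions with continuity correction.

    x1, x2: number of patients with the event in group 1 and group 2
    n1, n2: number of patients in group 1 and group 2
    Returns (chi-squared statistic with 1 degree of freedom, p-value).
    """
    a, b = x1, n1 - x1  # group 1: events, non-events
    c, d = x2, n2 - x2  # group 2: events, non-events
    n = n1 + n2
    row1, row2 = n1, n2
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) <= 0:
        raise ValueError("every row and column total must be positive")
    # Yates' correction: shrink |ad - bc| by n/2, but never below zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2.0)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented example: 30 of 100 versus 15 of 100 patients with an event.
chi2, p = two_proportion_test_yates(30, 100, 15, 100)
print(f"chi-squared = {chi2:.3f}, p = {p:.4f}")
```

The resulting p-value would then be compared against the multiplicity-adjusted significance threshold rather than against .05.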

To counteract the problem of multiple testing, p-values were adjusted with Bonferroni correction (n = 15 tests) separately for Cox regression and sensitivity analysis. An adjusted two-sided p-value < .003 was considered significant. Statistical analyses were performed using R Statistical Software (version 3.6.1) [28,29,30].
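The threshold of .003 is consistent with dividing a family-wise significance level of .05 by the 15 tests; the level of .05 is an assumption here, as it is not stated explicitly. The short Python sketch below shows the two equivalent ways of applying a Bonferroni correction, either lowering the threshold or inflating the p-values. The function names are hypothetical.

```python
from typing import List, Sequence


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def bonferroni_adjust(p_values: Sequence[float], n_tests: int) -> List[float]:
    """Bonferroni-adjusted p-values: multiply by the number of tests, capped at 1."""
    return [min(1.0, p * n_tests) for p in p_values]


# Assumption: family-wise alpha of .05 across the 15 tests named in the text.
threshold = bonferroni_threshold(0.05, 15)
print(f"per-test threshold = {threshold:.4f}")
```

Comparing an unadjusted p-value with `threshold` leads to the same decision as comparing the corresponding adjusted p-value with .05.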

Results

Baseline characteristics and unadjusted outcome rates

In total, 73,506 patients received DOACs. Of these, 21,724 (29.6%) received ld-DOACs. After excluding patients with DOAC in standard dose (n = 51,782), data from 41,903 patients were analysed, of whom 20,179 received phenprocoumon. The baseline characteristics are reported as proportion or mean with standard deviation in Table 1. Patients in the ld-DOAC cohort were more likely to be female, were older, had more comorbidities, and had a higher CHA2DS2-VASc score.
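The cohort sizes reported above are internally consistent, which can be verified with a few lines of Python. All numbers are taken from the text; only the variable names are introduced for illustration.

```python
# Counts as reported in the text.
total_doac = 73_506          # patients who received any DOAC
standard_dose_doac = 51_782  # excluded: DOAC in standard dose
analysed_total = 41_903      # patients analysed (ld-DOAC plus phenprocoumon)
phenprocoumon = 20_179       # patients who received phenprocoumon

# Low-dose DOAC patients follow from the exclusion step.
ld_doac = total_doac - standard_dose_doac
ld_share_percent = round(100 * ld_doac / total_doac, 1)

print(ld_doac, ld_share_percent, analysed_total - phenprocoumon)
```

Both routes to the ld-DOAC cohort size (total DOAC patients minus standard-dose patients, and analysed patients minus phenprocoumon patients) give the same count.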

Table 1 Baseline characteristics of phenprocoumon, composite low-dose DOAC (ld-DOAC) cohort and single ld-DOAC cohorts

Observed crude event rates before matching indicated a higher risk for all outcomes in the composite ld-DOAC cohort. This was consistent across the single ld-DOAC subgroup analyses, with the exception of bleeding in patients taking dabigatran (event rates per 100 patient-years: phenprocoumon = 4.05, ld-apixaban = 4.26, ld-dabigatran = 3.52, ld-edoxaban = 4.84, ld-rivaroxaban = 5.48) (Table 2). The mean follow-up times are depicted in Table S5, additional file 1.

Table 2 Event rates per 100 patient-years (py) for the outcomes thromboembolic events, death and bleeding before and after propensity-score matching

Cox regression analysis

Safety and effectiveness for the composite ld-DOAC cohort

Results showed statistically significantly fewer thromboembolic events and deaths with phenprocoumon than with ld-DOACs. There was a non-significant trend towards fewer bleeding events with composite ld-DOACs (thromboembolic events: HR = 1.29, 95% CI [1.13, 1.48], p < .001; death: HR = 1.52, 95% CI [1.41, 1.63], p < .001; bleeding: HR = 0.89, 95% CI [0.79, 1.00], p = .051).

Safety and effectiveness in single ld-DOAC subgroups

Regarding the single ld-DOAC subgroups, the effect of fewer thromboembolic events with phenprocoumon was only statistically significant in the ld-apixaban cohort. In the other cohorts, risks did not differ significantly from phenprocoumon (phenprocoumon vs. ld-apixaban: HR = 1.42, 95% CI [1.21, 1.65], p < .001).

All subgroup cohorts except ld-dabigatran were associated with a higher risk of death than phenprocoumon (ld-apixaban: HR = 1.63, 95% CI [1.50, 1.76], p < .001; ld-dabigatran: HR = 1.12, 95% CI [0.94, 1.34], p = .193; ld-edoxaban: HR = 1.40, 95% CI [1.22, 1.60], p < .001; ld-rivaroxaban: HR = 1.45, 95% CI [1.32, 1.59], p < .001).

A statistically significantly lower bleeding risk was shown only for the ld-apixaban cohort. In the ld-dabigatran and ld-edoxaban cohorts, a slight tendency towards a lower bleeding risk was shown. For the ld-rivaroxaban cohort, the association was reversed (ld-apixaban: HR = 0.75, 95% CI [0.65, 0.86], p < .001; ld-dabigatran: HR = 0.86, 95% CI [0.64, 1.14], p = .298; ld-edoxaban: HR = 0.95, 95% CI [0.75, 1.21], p = .700; ld-rivaroxaban: HR = 1.11, 95% CI [0.96, 1.29], p = .155). The results are depicted in Fig. 2. Hazard ratios for all covariates are depicted in Fig. S1, additional file 1.

Fig. 2

Cox proportional hazard regression model for the comparison of low-dose DOAC versus phenprocoumon. Adjusted hazard ratios with 95%-confidence interval and p-value adjusted with Bonferroni correction (adjusted p-value < .003, n = 15)

Sensitivity analysis

The results of the Cox regression models comparing phenprocoumon and ld-DOACs regarding effectiveness and safety were consistent with the analysis after PSM (Table 3). The outcome death was associated with the highest absolute risk increases in the ld-DOAC subgroups compared to phenprocoumon, ranging from 2.8% in the ld-rivaroxaban cohort to 5.7% in the ld-apixaban cohort. Regarding the outcome major bleeding, ld-apixaban was associated with a statistically significant absolute risk reduction (ARR = 1.2%, 95% CI [0.6%, 1.8%]). The event rates per 100 patient-years after matching are shown in Table 2.

Table 3 Results of analysis of effectiveness and safety of low-dose DOAC (ld-DOAC) versus phenprocoumon

Discussion

The analysis of routine health-care data revealed a small but statistically significantly higher risk for thromboembolic events and death for patients with ld-DOAC as compared to phenprocoumon. A non-significant trend towards a lower risk of severe bleeding in patients with ld-DOAC compared to phenprocoumon was seen. Regarding the single ld-DOAC subgroups, ld-apixaban has a small disadvantage in effectiveness on the one hand and a small advantage concerning major bleeding on the other. All ld-DOACs except ld-dabigatran were associated with a significantly higher risk of death.

Severe renal impairment is one of the main indications for dose reduction of DOACs [3]. The rate of ld-DOAC patients with renal impairment in our study was less than 50%. Although the quality of coding of renal impairment in claims data is known to be low, this might still indicate that a noticeable proportion of patients could be underdosed with ld-DOAC. Previous studies showed that many patients (22 to 57%) with ld-DOAC do not meet the criteria for a reduced dose [13, 16, 18, 31]. Presumably, prescribing a lower dose than recommended is motivated by the intention to protect the patient, as inadequate ld-DOAC is often prescribed to elderly patients at a higher risk for bleeding and stroke [16, 18, 31, 32]. In the case of VKA therapy, patients at risk would possibly receive close INR monitoring to prevent overdosing, or a dose targeted at the lower border of the INR range, as low-intensity VKA therapy was shown to be as effective as and safer than standard VKA therapy in a recent meta-analysis [33]. However, routine monitoring for DOAC therapy is not established. Triggered by caution and uncertainty, this could lead to an inappropriate dose reduction in patients with an assumed higher bleeding risk. Previous studies demonstrated that this could harm the patient: a higher all-cause mortality and a similar risk for stroke or systemic embolism with inadequate compared to adequate DOAC dosing [31], and a 2.5-fold increase in risk for thromboembolic events compared to VKA, were recently reported [34]. Hence, ld-DOAC should be prescribed according to the recommendations and guidelines.

Inadequate and non-recommended prescribing of ld-DOACs could be a factor that influenced our results. In our analysis, we combined patients with adequate and inadequate ld-DOAC therapy. In this combined population, the only statistically significant beneficial effect of ld-DOAC was seen for bleeding in patients with ld-apixaban. Thus, phenprocoumon seems to be superior to ld-DOACs as prescribed today in Germany. However, the results might be different if ld-DOACs were prescribed only for patients meeting the guideline criteria. As laboratory parameters (for example, of renal function) and information on patients’ weight are not part of our data, we could not separate patients with adequate ld-DOAC therapy from those with inadequate ld-DOAC therapy.

Comparison of the results to the existing literature

To our knowledge, there is no RCT comparing efficacy and safety of DOACs and phenprocoumon in patients with AF except for a few studies with focus on special comorbidities like end-stage kidney disease or special situations like catheter ablation [35,36,37,38,39]. For most of them the results are not yet published. The major RCTs compared DOACs with warfarin. To conduct an appropriate comparison of our results to the existing literature, we first focus on two other real-world studies comparing ld-DOACs with phenprocoumon before we discuss why the results of our study might differ from those of RCTs conducted with warfarin.

Comparison to other real-world studies with ld-DOACs and phenprocoumon

Effectiveness

As in our study, a significantly higher risk for thromboembolic events in patients with ld-DOACs than with phenprocoumon was also shown by Mueller et al. [15]. In the study of Hohnloser et al., at least ld-apixaban was associated with an even lower risk for the composite outcome stroke/systemic embolism [14]. However, in terms of ischemic stroke, the differences between the single ld-DOACs and phenprocoumon were not statistically significant. In contrast to our study and the one by Mueller et al., Hohnloser et al.’s study included haemorrhagic stroke in the composite outcome stroke and systemic embolism.

Bleeding risk

Contrary to our results, a benefit of phenprocoumon over composite ld-DOACs regarding severe bleeding risk was shown by Mueller et al. [15]. Their composite DOAC cohort, however, comprised mainly patients with rivaroxaban. Hohnloser et al. reported a beneficial effect of ld-apixaban, which is also shown by our results [14]. In contrast to our results, they also found a lower major bleeding risk with ld-dabigatran; however, in terms of any bleeding, intracranial bleeding, and gastrointestinal bleeding, no significant differences were shown. The risk of intracranial bleeding was lower than the risk of gastrointestinal bleeding with ld-apixaban, ld-dabigatran, and ld-rivaroxaban. Our study cannot validate this effect, as we did not differentiate between different types of bleeding. A possible explanation for the lower bleeding risk of apixaban in comparison to the other DOACs might be found in pharmacokinetics [40, 41]. Apixaban is the only factor Xa inhibitor that is administered twice daily. However, the exact reason is not known.

Risk of death

In line with our results, Hohnloser et al. found a higher risk of death with ld-rivaroxaban and ld-apixaban, although the trend for ld-apixaban was not statistically significant [14]. Ld-edoxaban was only analysed in our study and was associated with a higher risk of death as well. The cause of the seemingly increased risk of death for patients with ld-DOAC remains uncertain in our data. Residual confounding that creates an apparent association between risk of death and ld-DOAC therapy has to be considered as a partial explanation.

Why the results of our study might differ from those of RCTs conducted with warfarin

Only in the pivotal RCTs for dabigatran and edoxaban were DOACs in reduced dosage analysed separately [6, 7]. Those results partially differed from ours. Specific methodological characteristics of RCTs and real-world studies might cause discrepancies in results. Studies using routine health-care data represent large and unselected populations, whereas RCTs apply strict selection criteria to their study populations. Patients in RCTs are often younger, have fewer comorbidities, and show higher treatment adherence [42]. Real-world studies therefore represent an important complement to RCTs for investigating the effectiveness and safety of new drugs [43]. In addition to methodological differences, differences between the results of our study and those of the RCTs might be influenced by comparing DOACs to phenprocoumon instead of warfarin. As mentioned above, phenprocoumon differs in pharmacological properties and was associated with a better TTR than warfarin in previous studies. Additionally, there are great differences between anticoagulation management in Germany and other European countries. Le Heuzey et al. compared anticoagulation management in five European countries. Two factors are distinct for Germany: phenprocoumon is predominant only in Germany and, in contrast to the other countries, INR measurements in Germany are mainly performed in physicians’ offices and as self-management, while in other countries they are also performed in hospitals, anticoagulation centres, and laboratories [22]. It is possible that both factors affect the TTR. In contrast to our results favouring phenprocoumon, Nielsen et al. did not find statistically significant differences regarding thromboembolism in a real-world study with ld-DOACs and warfarin [19]. Regarding overall bleeding risk, they showed a small benefit only for dabigatran, associated with an advantage in haemorrhagic stroke but not in major bleeding. Showing a higher risk of death in the ld-rivaroxaban and ld-apixaban cohorts, our results are consistent with those of Nielsen et al. The small differences from the results of Nielsen et al. could hence be partially explained by comparing ld-DOACs to warfarin instead of phenprocoumon.

Our study generates the hypothesis that ld-DOAC therapy, as practiced in Germany today, might be inferior to phenprocoumon. In the absence of an RCT with phenprocoumon, further research is needed to affirm or refute this hypothesis. Future research should focus on whether the beneficial effect of phenprocoumon shown in our study is caused by an inappropriate use of ld-DOAC and/or by differences between phenprocoumon and warfarin.

Limitations

As in any retrospective observational study using claims data, adjustments and matching could only be based on information available in the data. Therefore, residual bias driven by undocumented information cannot be ruled out. Coding of diagnoses is not perfectly accurate, especially for smoking status (in Germany frequently coded as a diagnosis (ICD-10 F17.1)) or obesity [44]. As the data did not include the results of INR testing, the TTR and its effect on the outcomes cannot be determined. The physicians’ rationales for prescribing a reduced dosage cannot be derived from the data, so the proportion of patients with an inadequately reduced dosage cannot be determined either. As a strength shared by real-world studies, the results potentially better reflect the actual health care situation owing to the large and more representative study population with wide selection criteria in comparison to RCTs.

Conclusion

Our data revealed statistically significantly lower rates of thromboembolic events and death for phenprocoumon without a statistically significant increase in major bleeding, despite the high number of patients analysed. As a hypothesis, phenprocoumon might be superior to the ld-DOAC regimen as practiced today. Due to the inherent limitations of real-world studies, the results of our study should be evaluated by RCTs comparing ld-DOACs to phenprocoumon. As long as such RCTs do not exist, real-world studies form the highest level of evidence available when it comes to comparing ld-DOACs with phenprocoumon. According to them, phenprocoumon might be a better choice than ld-DOACs for high-risk patients with AF.