Background

The prevalence and incidence of atrial fibrillation (AF) increase with age [1]. In addition, AF is associated with an increased risk of cognitive decline and dementia [2], independently of shared risk factors or overt stroke. Several mechanisms might explain a causal role of AF in cognitive impairment, such as silent brain infarctions, cerebral microbleeds, and hypoperfusion [3, 4]. Despite limited and conflicting evidence [5], large observational studies [6, 7] support a role for oral anticoagulation in reducing the risk of dementia in patients with AF, for whom effective therapeutic agents to mitigate the burden of dementia on healthcare systems [8] are needed.

Long-term oral anticoagulation therapy is currently recommended for patients with AF and a moderate-to-high risk of stroke, with non-vitamin K oral anticoagulants preferred over vitamin K antagonists (VKA) [1], due to significant risk reductions of systemic embolism and hemorrhagic stroke [9]. Cognitive outcomes, however, were not assessed in the pivotal randomized clinical trials (RCTs) that support this recommendation [10], including the RE-LY trial (The Randomized Evaluation of Long-Term Anticoagulation Therapy), in which dabigatran etexilate was shown to be non-inferior to warfarin for the prevention of stroke and systemic embolism [10]. Observational studies added further uncertainty regarding the best strategy to prevent cognitive decline in patients with AF, as comparisons between non-vitamin K oral anticoagulants and VKA yielded different results [7, 11, 12, 13]. Since dabigatran has the theoretical advantage of a more stable anticoagulation status as compared to warfarin, it could improve cognition-related outcomes in patients with AF who are at risk of cognitive decline. Therefore, we conducted the CoGnitive Impairment Related to Atrial Fibrillation (GIRAF) randomized hypothesis-generating trial comparing the use of dabigatran with warfarin in older adults with nonvalvular AF. We hypothesized that dabigatran would reduce cognitive decline, assessed by extensive cognitive tests, independently of stroke.

Methods

Study design

The GIRAF trial was a 24-month, randomized, parallel-group, controlled, open-label, hypothesis-generating trial comparing dabigatran with warfarin in patients with AF or atrial flutter, conducted in Sao Paulo, Brazil. The study protocol was approved by the local ethics committee and complies with the ethical principles of the Declaration of Helsinki and the International Conference on Harmonization Good Clinical Practice.

The GIRAF trial was an investigator-initiated study, partially funded by Boehringer Ingelheim do Brasil, which also provided dabigatran. The sponsor had no role in study design, trial execution, data analysis, writing/reviewing the manuscript, or in the submission for publication. Clinical Trial Registration: NCT01994265 (URL: www.clinical.trials.gov)

Patients

Patients who were being followed at six centers in Sao Paulo (including a geriatric care unit and secondary and tertiary care cardiology hospitals) were invited to participate in the trial, but all study procedures, including the final screening process, randomization, clinical and neurologic follow-up, and endpoint assessment, were performed at one site (Instituto do Coracao, HCFMUSP, Sao Paulo). Eligible patients were 70 years or older, had a history of AF or atrial flutter documented by a conventional 12-lead electrocardiogram (ECG) or by an ECG strip with a duration of 30 seconds or longer, and had a CHA2DS2-VASc score of 2 or higher. Key exclusion criteria were illiteracy or less than 4 years of education, severe valvular heart disease (defined as any of the following anatomically severe valvular heart diseases, per echocardiogram with compatible physical findings and cardiac auscultation: aortic stenosis/regurgitation, mitral regurgitation/stenosis, pulmonary regurgitation/stenosis, or tricuspid regurgitation/stenosis), diagnosis of dementia (based on clinical judgment by the neurologist and on MMSE scores below education-adjusted norms for the Brazilian population), previous stroke or transient ischemic attack, severe liver disease, chronic kidney disease grade KDIGO 4 or worse (estimated glomerular filtration rate < 30 mL/min/1.73 m2), and major contraindications to oral anticoagulation. Full details of inclusion and exclusion criteria are available in the supplementary appendix.

Randomization

After eligible patients provided informed consent, they were randomized 1:1 via a randomization program using the Research Electronic Data Capture (REDCap) system to receive either open-label dabigatran 150 mg or 110 mg twice daily (110 mg dose for patients ≥ 80 years or with an eGFR between 30 and 50 mL/min/1.73 m2) or warfarin once daily titrated to achieve an international normalized ratio (INR) of 2.0 to 3.0.

Procedures

Up to 15 days after randomization, all patients underwent a baseline cognitive evaluation. For patients who were using an oral anticoagulant other than their assigned drug before randomization, switching to dabigatran was performed according to current guidelines [14]. For switching from dabigatran (or another non-vitamin K oral anticoagulant) to warfarin, warfarin was started according to the creatinine clearance: 3 days before discontinuing the non-vitamin K oral anticoagulant if the creatinine clearance was ≥ 50 mL/min, and 2 days before discontinuing it if the creatinine clearance was between 30 and 50 mL/min.

A pre-specified, comprehensive cognitive evaluation covering different cognitive domains was performed at baseline and at 24 months, based on the recommendations of the National Institute of Neurological Disorders and Stroke-Canadian Stroke Network Vascular Cognitive Impairment Harmonization Standards [15]. The Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA) were administered as brief measures of global cognitive functioning. In addition, participants completed a neuropsychological test battery (NTB), including the following tests: Trail-Making tests A and B, short form (15 items) of the Boston naming test (BNT), clock drawing test (CDT), digit symbol substitution test (DSST), phonemic verbal fluency test (FAS), semantic verbal fluency test (SVF; animals/minute), and the Figure Memory Test (including immediate, learning, and delayed recall). Participants also underwent computer-generated neuropsychological tests (CGNT), which evaluated simple reaction time and sustained, selective, and divided visual attention, with measures of accuracy (i.e., percentage of correct responses) and reaction time (in milliseconds). A detailed explanation of the CGNT can be found elsewhere [16]. All the tests have been used previously in Brazilian Portuguese versions. Cognitive evaluations lasted approximately 90 min and were performed by two neurologists, blinded to group assignments, on separate visit days from clinical consultations. Details regarding the thorough cognitive evaluation are provided in the supplementary appendix.

In patients randomized to open-label warfarin, INR was measured weekly until the INR goal was reached, then biweekly, and then monthly if the drug dosing was stable and the INR remained within the target range (2.0 to 3.0). The time that the INR was within the therapeutic range during the trial was calculated using the method of Rosendaal et al. [17]. Clinical consultations were performed every 3 months for both groups.
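To make the time-in-therapeutic-range calculation concrete, the following is a minimal sketch of a Rosendaal-style linear interpolation between consecutive INR measurements. It is not the trial's analysis code: the function name, the input layout (a list of day/INR pairs), and the continuous treatment of time between visits are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def time_in_therapeutic_range(
    measurements: Sequence[Tuple[float, float]],
    low: float = 2.0,
    high: float = 3.0,
) -> float:
    """Percentage of follow-up time with interpolated INR inside [low, high].

    measurements: (day, INR) pairs; hypothetical input layout.
    INR is assumed to change linearly between consecutive measurements.
    """
    points = sorted(measurements)
    if len(points) < 2:
        raise ValueError("at least two INR measurements are required")

    days_in_range = 0.0
    total_days = 0.0
    for (day0, inr0), (day1, inr1) in zip(points, points[1:]):
        span = day1 - day0
        if span <= 0:
            continue  # skip duplicate time points
        if inr0 == inr1:
            fraction = 1.0 if low <= inr0 <= high else 0.0
        else:
            lower, upper = min(inr0, inr1), max(inr0, inr1)
            # Share of the straight line between the two INRs that lies in range.
            overlap = max(0.0, min(upper, high) - max(lower, low))
            fraction = overlap / (upper - lower)
        days_in_range += fraction * span
        total_days += span

    if total_days == 0:
        raise ValueError("measurements must span a positive time interval")
    return 100.0 * days_in_range / total_days


if __name__ == "__main__":
    # Illustrative values only, not trial data.
    example = [(0, 2.5), (7, 2.5), (14, 3.5)]
    print(round(time_in_therapeutic_range(example), 1))
```

In the illustrative series, the INR stays in range for the first 7 days and for half of the second interval, while it rises from 2.5 to 3.5 and crosses the upper limit of 3.0 midway.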

Outcomes

Primary outcomes were changes in cognitive performance at 24 months from baseline, measured with MoCA, MMSE, NTB, and CGNT scores, as each test analyzes specific cognitive domains. Importantly, despite analyzing different domains of cerebral performance, all tests analyze cognitive function, which represents the main outcome of our study. The NTB and CGNT (accuracy and reaction time measures) scores were calculated as composite Z-scores, by averaging individual tests’ Z-scores weighted according to the number of available tests per patient. Prior to calculation, all components (tests) were standardized to indicate a better performance with higher scores (e.g., by using the negative of reaction time for the CGNT components). The minimum necessary number of components for calculating a patient’s composite score was set to six for the NTB score and seven for the CGNT. The respective value of the composite score was considered missing if the minimum number of components was not met. Exploratory outcomes, based on post hoc analyses, were changes in cognitive domain scores for executive functioning, attention, language, and memory at 24 months in comparison to baseline. The executive functioning domain included the CDT and Trail-Making tests A and B. The attention domain included the DSST and all CGNT tests. The language domain included the BNT, FAS, and SVF tests. The memory domain included the Figure Memory Test. The minimum necessary number of components for calculating the composite score was two for the executive functioning, language, and memory domains, and seven for the attention domain; a missing value was assigned otherwise. For all tests, cognitive decline was defined as any decline in Z-scores over time. Additional methods for neuroimaging for the diagnosis of silent stroke and for biomarker assessments are described in the supplementary appendix.
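The composite-score rule described above (average the available component Z-scores, flip the sign of components where lower is better, and return a missing value when too few components are available) can be sketched as follows. This is an illustration under assumptions, not the trial's code: the function name, the use of None for a missing test, and the test labels are hypothetical, and the component Z-scores are taken as already computed.

```python
from typing import AbstractSet, Mapping, Optional


def composite_z(
    z_scores: Mapping[str, Optional[float]],
    min_components: int,
    lower_is_better: AbstractSet[str] = frozenset(),
) -> Optional[float]:
    """Mean of the available component Z-scores, or None if too few are available.

    z_scores: test label -> Z-score, with None marking a missing test (assumed layout).
    lower_is_better: labels whose sign is flipped so that higher means better,
    e.g. reaction-time measures.
    """
    available = [
        -z if name in lower_is_better else z
        for name, z in z_scores.items()
        if z is not None
    ]
    if len(available) < min_components:
        return None  # composite is treated as missing
    return sum(available) / len(available)


if __name__ == "__main__":
    # Hypothetical patient with one missing component; values are illustrative.
    patient = {"accuracy_a": 1.0, "accuracy_b": None, "accuracy_c": -0.5, "rt_a": 2.0}
    print(composite_z(patient, min_components=3, lower_is_better={"rt_a"}))
    print(composite_z(patient, min_components=4, lower_is_better={"rt_a"}))
```

With the thresholds stated in the text, min_components would be six for the NTB and seven for the CGNT; the smaller values in the example are only there to keep it short.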

Statistical analyses

On the basis of a post-hoc analysis of two randomized controlled trials [18] and on the clinical practice expertise of the authors, we estimated a mean drop of 2 points in the MMSE score after 24 months with a standard deviation of 2 points. Assuming a 10% dropout rate and similar between-group differences at 24 months for all primary outcomes, we calculated that a sample size of 200 patients would provide our study with 80% power to detect a 50% difference in the change in cognitive scores (measured by any of the primary outcomes) in patients treated with dabigatran compared to warfarin. These estimates were later further supported by a study [19] that estimated a mean 0.2 drop in the NTB Z-score with a standard deviation of 0.5.
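For orientation, the standard normal-approximation formula for comparing two means can be sketched as below. This is not the authors' calculation: the function names are hypothetical, and reading the "50% difference" as a 1-point difference in MMSE change (half of the assumed 2-point drop, with a standard deviation of 2 points) is an assumption. The sketch is not expected to reproduce the planned sample size of 200, which may rest on considerations not detailed in the text.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per group for a two-sided, two-sample comparison of means.

    Uses n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / delta) ** 2, rounded up.
    """
    if delta <= 0 or sd <= 0:
        raise ValueError("delta and sd must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


def inflate_for_dropout(n: int, dropout_rate: float) -> int:
    """Enrolment needed so that about n patients remain after the expected dropout."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return ceil(n / (1 - dropout_rate))


if __name__ == "__main__":
    # Assumed inputs: 1-point difference in MMSE change, SD of 2 points.
    completers = n_per_group(delta=1.0, sd=2.0)
    print(completers, inflate_for_dropout(completers, dropout_rate=0.10))
```

The same function applies to the NTB estimate by passing the difference and standard deviation in Z-score units.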

The primary analysis was conducted in the modified intention-to-treat (mITT) population, including all patients who underwent both baseline and 24-month cognitive evaluations, censoring for patients who had stroke or other cerebrovascular events throughout the study. Additional sensitivity analyses were also performed to test the consistency of our findings: first using a per-protocol analysis on the mITT population (excluding patients that switched or stopped their oral anticoagulation during the 24-month period and including all randomized patients that underwent the first cognitive evaluation) and second using regression-based multiple imputation to estimate missing values for the 24-month evaluations. A linear regression was carried out for each primary outcome with treatment (D or W, 0/1 coded), age (years), education (log of years), and baseline raw score as covariates, with no interaction terms. After individual analyses of the relationship between covariates and dependent variables, we found only weak linear relationships. Additional analyses of the residuals of the linear regressions disclosed no major departures from the standard assumptions. We report the results as least-square mean changes from baseline for each group (higher scores indicate better cognitive performance) and as the difference between groups, at baseline and 24 months, with 95% confidence intervals. Cohen’s d standardized effect sizes are also reported based on the mean treatment difference between groups and the residual standard deviation. The confidence intervals and P-values reported refer to a two-sided alpha of 0.05 with no correction for multiple hypothesis testing. To account for the increased risk of a type 1 error in the multiple comparisons of primary endpoints, adjusted P-values were also computed using Hommel’s method and reported in the “Results” section.
All statistical analyses were performed using the R software, version 4.1.2 (R Foundation for Statistical Computing), and graphics were prepared using GraphPad Prism version 9.3.0 for Windows, GraphPad Software, San Diego, California USA, www.graphpad.com.
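The Hommel adjustment for multiple comparisons mentioned above can be sketched in a few lines. The following is an illustrative Python port of the commonly used step-wise algorithm, not the code used in the trial (which was run in R); the function name and the example P-values are hypothetical and are not the trial's results.

```python
from typing import List, Sequence


def hommel_adjust(p_values: Sequence[float]) -> List[float]:
    """Hommel-adjusted P-values, returned in the same order as the input."""
    n = len(p_values)
    if n <= 1:
        return list(p_values)

    order = sorted(range(n), key=lambda k: p_values[k])
    p = [p_values[k] for k in order]  # ascending P-values

    # Starting value shared by all hypotheses: min over k of n * p_(k) / k.
    start = min(n * p[k] / (k + 1) for k in range(n))
    adjusted = [start] * n
    q = [start] * n

    for m in range(n - 1, 1, -1):
        head = n - m + 1  # number of smallest P-values scaled directly by m
        # Best Simes-type bound over the m - 1 largest P-values.
        q1 = min(m * p[head + j] / (2 + j) for j in range(m - 1))
        for k in range(head):
            q[k] = min(m * p[k], q1)
        for k in range(head, n):
            q[k] = q[head - 1]
        adjusted = [max(a, b) for a, b in zip(adjusted, q)]

    adjusted = [max(a, raw) for a, raw in zip(adjusted, p)]

    # Restore the original input order.
    result = [0.0] * n
    for rank, k in enumerate(order):
        result[k] = adjusted[rank]
    return result


if __name__ == "__main__":
    # Hypothetical unadjusted P-values for four co-primary outcomes.
    print(hommel_adjust([0.01, 0.02, 0.03, 0.04]))
```

Each adjusted value is at least as large as the corresponding unadjusted P-value, so a difference that is borderline before adjustment can lose nominal significance afterwards.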

Results

Between November 7, 2014, and March 10, 2019, 5523 participants were screened and 200 patients already receiving anticoagulation for the prevention of stroke were randomly assigned to either dabigatran (N = 99) or warfarin (N = 101) treatment. The major reasons for ineligibility were prior valvular heart disease (28%) and prior TIA or stroke (18%). A full list of ineligibility criteria is shown in the supplementary appendix. The mITT analyses included 149 patients who completed the 2-year cognitive assessment (Fig. 1). There were no significant between-group differences at baseline regarding age, sex, years of education, MMSE, HAS-BLED, and CHA2DS2-VASc scores. MoCA, NTB, and CGNT scores, however, were different between groups at baseline (Table 1, and see Additional file: Table S3-S11, Table S12 [20,21,22,23], and Figures S4-S10).

Fig. 1

GIRAF study flowchart. The patient flowchart depicts patients who completed the 2-year cognitive assessment, dropped out, or developed intolerance to medication. AF, atrial fibrillation; AFL, atrial flutter; TIA, transient ischemic attack

Table 1 Baseline characteristics of the patients included in the GIRAF trial. Data are depicted according to arm allocation for patients that completed the 2-year cognitive assessment (mITT population, no imputation). Numbers indicate median (IQR) for non-normal continuous variables, mean (standard deviation) for normally distributed continuous variables, and number (percentage) for dichotomous variables. Normality was assessed by a Shapiro-Wilk test at 5% significance

Primary outcomes

Mean changes from baseline in each group, reported as least-square means (± SE), between-group differences with 95% confidence intervals, respective (unadjusted) P-values, and Cohen’s d effect sizes for between-group differences are shown in Table 2.

Table 2 Mean changes from baseline in dabigatran and warfarin groups for the primary cognitive outcomes. Data report the marginal effects (least-squares mean change from baseline score) adjusted for age (in years), log of years of education, and raw baseline score (mITT population, no imputation). Contrast values are between-group differences in the least-square mean change (dabigatran–warfarin). A positive value of contrast indicates a relative improvement (or smaller cognitive decline) of the group treated with dabigatran. There was no correction for multiple testing. Cohen’s d shows the effect size (contrast) as a proportion of the variation (residual standard deviation) of the adjusted least-square mean change

After controlling for age (in years), log of years of education, and raw baseline score, the difference between the mean change from baseline at 24 months in the dabigatran group minus warfarin group was not statistically significant for the MMSE, NTB, and CGNT scores. For CGNT, accuracies and reaction times of visual attention tests also failed to show a significant difference between the two study groups. For the MoCA score, we observed a significant difference when no correction for multiple testing is performed, suggesting less cognitive decline in the warfarin group. Figure 2 depicts the adjusted mean changes between groups from baseline estimates (points) and 95% confidence intervals (segments) for the four primary outcomes. Using Hommel’s correction for multiple comparisons, we obtained adjusted P-values of 0.74 for MMSE, 0.08 for MoCA, 0.74 for NTB, and 0.66 for CGNT, showing that detected between-group differences are not statistically significant.

Fig. 2

Primary cognitive outcomes in dabigatran and warfarin groups. Comparison between the dabigatran (D) and warfarin (W) groups for the four tests that represent the primary cognitive outcomes. There were no significant differences between treatment groups for most of the cognitive tests at 2 years (except for MoCA) in comparison to baseline. Differences between groups are expressed as the adjusted mean change from baseline (points) with 95% confidence intervals (segments)

Exploratory outcomes

Cognitive decline per domain

There were no significant differences between dabigatran and warfarin treatment groups in any cognitive domain at 2 years in comparison to baseline (Fig. 3). No patient was diagnosed with dementia during the study.

Fig. 3

Exploratory cognitive outcomes in dabigatran and warfarin groups grouped by cognitive domains. Differences between dabigatran and warfarin treatment groups are depicted from baseline estimates (points) and 95% confidence intervals (segments) in the outcomes grouped by cognitive domains

Cognitive decline per anticoagulation quality in the warfarin group

Time in therapeutic range (TTR) in the warfarin group was 69.9% (± 13.9). In a post hoc analysis, no significant interaction was seen for the primary outcomes in the subgroup with TTR ≥ 70%, as shown in Additional file: Table S3.

There were 14 deaths during the study (five deaths in the dabigatran group and nine deaths in the warfarin group, P = 0.61). Among these deaths, seven were non-cardiovascular deaths (three in the dabigatran group and four in the warfarin group) and seven were cardiovascular deaths (two in the dabigatran group and five in the warfarin group). Deaths were confirmed by death certificates and the causes of death were investigator-reported. We observed one episode of transient ischemic attack (TIA) and one stroke in patients from the warfarin group. Four patients developed intolerance to dabigatran and one to warfarin (Fig. 1) and were excluded from the analyses.

Discussion

GIRAF is the first randomized prospective and controlled trial comparing anticoagulant strategies in patients with AF or atrial flutter at risk of cognitive decline. The results of the analyses of the mean change from baseline in the MMSE, MoCA, NTB, and CGNT scores did not support our hypothesis that dabigatran would attenuate cognitive decline compared to warfarin, as no evidence of a beneficial effect of dabigatran was found between groups.

An extensive 90-min neuropsychological evaluation protocol with different tests was designed for the GIRAF trial to capture minimal differences in cognitive function between groups over time. The full trial protocol is available in the supplementary appendix. The use of a comprehensive range of tests distinguishes the GIRAF study from previous AF clinical trials evaluating cognitive function. The evaluation includes tests of global cognitive function (MMSE, MoCA, and NTB) and tests for specific cognitive domains, including the CGNT battery.

The exclusion of patients with previous stroke or dementia after baseline cognitive evaluation aimed to mitigate consequences of events before randomization. Although there was a significant difference favoring the warfarin group in the MoCA score at 24 months, it was not confirmed in the more exhaustive and comprehensive cognitive tests (NTB, CGNT) nor in the exploratory outcomes of cognitive domains (memory, executive function, language, and attention). Therefore, caution should be used when interpreting the results of the MoCA score separately.

Through more stable and predictable pharmacokinetics [24], dabigatran could be more effective than VKAs in preventing cognitive impairment by reducing both thrombus formation/cerebral micro-embolism and cerebral microhemorrhage. Several prior observational studies suggested that AF patients receiving non-vitamin K oral anticoagulants were less likely to be diagnosed with dementia [12, 13] or the combination of stroke, TIA, and dementia, compared to VKA users. Other studies, however, showed similar risks of dementia with warfarin and non-vitamin K oral anticoagulants [7, 11]. These conflicting results might be explained by inherent limitations of study design, such as residual confounding, unknown baseline cognitive status, misclassification, and stopping/switching oral anticoagulants during the follow-up period. Few studies [11] provided a direct comparison between VKA and non-vitamin K oral anticoagulants on the risk of specific subtypes of dementia (e.g., vascular dementia and Alzheimer's disease), and the mechanisms behind cognitive protection from non-vitamin K oral anticoagulants remain largely putative. Our cognitive domain analysis, evaluating the relative impact of dabigatran and warfarin on memory, executive function, language, and attention, was designed to provide key information to address this knowledge gap. We observed no significant differences between study groups in the studied cognitive domains.

The lack of benefit from dabigatran in our trial might be related to very well-managed warfarin administration. GIRAF trial patients randomized to warfarin had a TTR of roughly 70%, higher than in previous pivotal studies of non-vitamin K oral anticoagulants [10] and strikingly divergent from real-world data on anticoagulation quality [25], especially in low- and middle-income countries [26,27,28], with TTR levels as low as 23%. Also, even in patients with adequate TTR, stability over longer periods is unknown [29]. Indeed, observational studies suggest an association between warfarin therapy quality and cognition: both poor control [30] and supra-therapeutic [31] INRs are associated with an increased risk of dementia.

Cerebral hypoperfusion, inflammation, and AF-induced neuroendocrine disturbances are also proposed mechanisms [3, 32] underlying the increased risk of dementia in patients with AF that were not addressed in the GIRAF trial. Ongoing randomized clinical trials, evaluating the effects of different interventions on cognitive function [32] in patients with AF, will also support an in-depth understanding of this complex interaction between putative mechanisms and cognitive dysfunction.

Our study has important limitations. First, fewer patients in the warfarin group completed the 24-month cognitive assessment, due to an increased dropout rate, which could have biased the treatment effect. However, because we considered only patients with cognitive evaluation at baseline and 24 months in the mITT analysis with the primary outcome being the difference within each patient for the cognitive tests, used a linear model for the analysis of the co-primary outcomes, and performed two sensitivity analyses that were consistent with our main findings, we do not believe that our high attrition rate affected the observed differences between study arms. In addition, prior trials [33,34,35] testing interventions for cognitive decline had similar attrition rates.

Second, since cognitive decline was lower than expected in the warfarin group and the expected effect size of dabigatran in attenuating cognitive decline was not observed, the trial was underpowered to show a between-treatment difference. The well-controlled warfarin anticoagulation in the GIRAF trial (TTR of 70%) could have contributed by protecting patients against greater cognitive decline.

Third, we included mainly patients with a low educational level, a known early-life risk factor for dementia [8], and these results should not be extrapolated to other populations. Fourth, as we had very strict inclusion and exclusion criteria, the impact of different anticoagulant strategies in subgroups of patients that were excluded by the GIRAF trial design (such as patients with valvular heart disease) is not determined by our findings.

Finally, we cannot exclude that a 24-month window for cognitive evaluation was inadequate to examine whether dabigatran would have a favorable effect on cognition, and studies with extended follow-up periods are warranted. Notably, despite analyzing only Alzheimer’s disease patients, prior randomized trials were able to demonstrate an intervention effect on cognition after 24 months [19, 35].

The GIRAF trial also has several strengths: the extensive and thorough cognitive evaluation to assess global cognitive performance and different cognitive domains, and a prospective evaluation of cognition in a randomized controlled trial. Although a first step has been made in how to measure cognitive function in patients with AF [31], there is no gold standard for the ideal combination of tests that should be selected in randomized clinical trials. We believe that the GIRAF trial contributes to progress in this area, as the NTB test selection and innovative CGNT can provide an acceptable standard for future trials.

Conclusions

In conclusion, for elderly patients with atrial fibrillation and without cognitive impairment at baseline, who did not have stroke and were adequately treated with warfarin (TTR of 70%) or dabigatran for 2 years, there was no difference in most of the cognitive outcomes. As GIRAF is a hypothesis-generating trial that adopted unique methods for cognitive evaluation, these findings could sow the seeds of future exploration and research in this area.