
1 Introduction

Atrial fibrillation (AF) raises the risk of ischemic stroke fivefold, but this risk is largely reversed by prophylactic use of long-term anticoagulation [1]. However, because AF can be asymptomatic, patients with AF may not be diagnosed and treated, with the consequence that some will sustain a potentially preventable stroke. Such concerns motivate screening programs for AF [2,3,4,5]. Opportunistic screening by electrocardiogram (ECG) in individuals aged > 65 years during medical encounters is recommended by leading organizations such as the American Heart Association (AHA), the American Stroke Association (ASA), and the European Society of Cardiology (ESC) [2, 4]. Nonetheless, recent reports indicate that screening for AF is inconsistently used, and when used, may yield false negative results [2,3,4, 6].

A variety of technologies have been developed for extended systematic screening, which rely on taking multiple ECG measurements over several days (as opposed to a single measurement at one point in time) to reduce the false negative rate of AF detection [5, 7,8,9,10,11,12,13]. One such technology, the Zenicor hand-held device, was recently evaluated in the prospective STROKESTOP study in Sweden [13, 14]. STROKESTOP reported a new diagnosis of AF in 3% of individuals screened, of whom 90% initiated anticoagulant therapy [13].

Based on this evidence, the AF-Screen International Collaboration suggests that intermittent screening with hand-held devices in individuals aged ≥ 75 years or in younger individuals with a high risk of AF or stroke, in addition to single time point screening of individuals aged ≥ 65 years, may be warranted [15]. Nonetheless, this recommendation has not been adopted by all task forces and collaborations. The US Preventive Services Task Force (USPSTF) conducted a systematic review of screening for non-valvular AF (NVAF) in patients aged ≥ 65 years, and reported that current evidence is insufficient to determine the net benefit of screening with ECG in asymptomatic adults [8, 16]. Their primary concern was that no study had yet demonstrated a reduction in stroke rate resulting from AF screening. Similarly, the authors of a recent review of the rationale behind AF screening concluded that additional research is required to identify optimal screening strategies (i.e. whether single time point, intermittent, or continuous monitoring would be optimal) [5]. Very large trials with long follow-up have been initiated to definitively demonstrate the net clinical benefit of AF screening [17, 18].

In the shorter run, simulation modeling can provide estimates of the clinical benefit and cost effectiveness of alternative screening strategies [8, 17, 19,20,21]. Numerous health economic models have been developed to project the long-term trajectory of patients screened for AF versus no systematic screening, in terms of oral anticoagulation uptake, the occurrence of thromboembolic and major bleeding events, and the consequent impact on costs [9, 22,23,24,25]. Although these studies have demonstrated that opportunistic or systematic screening for AF may lead to long-term clinical benefits at an economically acceptable additional cost, these evaluations have so far been conducted only in non-US settings [9, 22,23,24,25]. The potential long-term clinical benefits and cost effectiveness of systematic screening for AF remain unclear from the perspective of a US third-party payer [8, 17].

Against this background, our study evaluates the clinical and economic value of systematic screening for NVAF in the USA. We used a health economic model to assess the cost effectiveness of two strategies compared with no screening in individuals aged ≥ 75 years: one-time screening with a 12-lead ECG, and intermittent screening for 14 days with a hand-held, single-lead ECG device. In both strategies, detection of NVAF was followed by treatment with an oral anticoagulant (OAC).

2 Materials and Methods

2.1 Model Structure

A previously published Markov cohort model that assessed anticoagulation treatments for NVAF was extended to incorporate screening for NVAF in this study [26]. The model was developed in Microsoft Excel® using a 13-week cycle length. Three identical cohorts of 75-year-old individuals with no known NVAF were simulated over a lifetime horizon. At the beginning of the model, the first cohort was screened once with 12-lead ECG, the second was screened for 14 days with Zenicor single-lead ECG (not available in the USA) (Z14), and the third did not undergo screening (no screening).

Based on screening results, all patients with a true positive diagnosis of NVAF were assigned treatment with an OAC, if eligible (e.g. no absolute contraindications such as intracranial hemorrhage), and patients with a true negative, false negative, or false positive diagnosis were not assigned treatment (Fig. 1). Patients with a true positive diagnosis and a false negative diagnosis were then subjected to the risk of the following events: ischemic and hemorrhagic strokes of varying severity, systemic embolism, major and non-major bleeding events, myocardial infarction, treatment discontinuation, and death [26]. Treatment with OAC was assumed to reduce the risk of thromboembolic events but increase the risk of bleeding. In all three cohorts, patients with an initial false negative diagnosis could be diagnosed after the initial screening by routine monitoring (background detection). Conversely, patients with an initial true positive diagnosis could discontinue OAC treatment and thereafter receive no treatment. Patients with a true negative or false positive diagnosis (as determined by a confirmatory test) were not subjected to the risk of these clinical events, as they represent the proportion of the cohort without NVAF, which is unaffected by screening. Because patients without AF have the same health and economic outcomes and represent equal proportions of the cohort in each screening arm, their outcomes have no effect on the incremental results. However, the risk of developing NVAF was explicitly modeled.

Fig. 1 Simplified schematic of decision process in screening model. NVAF non-valvular atrial fibrillation, OAC oral anticoagulant

2.2 Model Inputs

2.2.1 Epidemiology Inputs and Screening Efficacy

Based on data from the STROKESTOP study [13], we assumed that the prevalence of permanent undiagnosed NVAF was 0.5% and that it would always be detected regardless of screening modality. The true prevalence of undiagnosed paroxysmal NVAF remains largely unknown, and detection rates depend on the screening method. For consistency, we used data from the STROKESTOP study, which reported that undiagnosed paroxysmal NVAF is found in 2.5% of patients aged ≥ 75 years when screening intermittently with the Zenicor device (Zenicor) for 14 days. In other words, the overall prevalence of NVAF was assumed to be 3%, with 83.33% of cases being paroxysmal (Table 1) [13]. Based on evidence from the STROKESTOP study, one-time screening with 12-lead ECG was assumed to detect 20.9% of the NVAF cases found when screening with Zenicor for 14 days, for a detection rate of 0.63%. Assuming that the 0.5% permanent cases would always be detected, the remaining cases ([0.63% − 0.50%]/0.63%) would be paroxysmal [13, 14]. Among patients with NVAF, it was assumed that 5% would be detected every year through background detection [22].
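The prevalence arithmetic in this paragraph can be retraced in a few lines. The sketch below uses only the percentages stated above; the variable names are illustrative and are not taken from the published model.

```python
# Prevalence inputs stated in the text, as fractions of the screened cohort.
permanent = 0.005        # undiagnosed permanent NVAF, assumed always detected
paroxysmal_z14 = 0.025   # undiagnosed paroxysmal NVAF found by 14-day Zenicor

# Overall prevalence detectable with Z14 and the paroxysmal share of those cases.
total_z14 = permanent + paroxysmal_z14
paroxysmal_share_z14 = paroxysmal_z14 / total_z14

# One-time 12-lead ECG is assumed to find 20.9% of the Z14-detected cases.
ecg_detected = 0.209 * total_z14

# Permanent cases are always detected, so the remainder must be paroxysmal.
ecg_paroxysmal_share = (ecg_detected - permanent) / ecg_detected

print(f"Z14 detection rate:            {total_z14:.2%}")
print(f"Paroxysmal share (Z14):        {paroxysmal_share_z14:.2%}")
print(f"12-lead ECG detection rate:    {ecg_detected:.2%}")
print(f"Paroxysmal share (12-lead ECG): {ecg_paroxysmal_share:.1%}")
```

With unrounded intermediate values, roughly one fifth of the cases detected by 12-lead ECG are paroxysmal; the bracketed expression in the text uses the rounded 0.63% and gives a marginally different share.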

Table 1 Inputs related to alternative screening strategies

Specificity of 12-lead ECG was assumed to be 100%, since this represents the gold standard and reference in most screening studies [9, 27]. No information was identified regarding the specificity of Zenicor used over 14 days. The results from a study that evaluated asymptomatic AF in 100 patients who registered their cardiac rhythms with 12-lead ECG immediately followed by Zenicor suggested a specificity of 92%; however, these estimates were based on a single time point screen [27]. It is unclear whether specificity would improve over an extended period due to repeated screens. Therefore, in the base case, we assumed that specificity of extended screening would be 100%. Alternative estimates were examined in scenario analysis.

2.2.2 Risk of Clinical Events

Patients with paroxysmal and permanent NVAF were assumed to have different risks of ischemic stroke, but equivalent risks of other thromboembolic and bleeding events [28]. The annual risk of progression from paroxysmal to permanent NVAF was assumed to be 5.54%, taken from a systematic review and meta-regression of randomized controlled studies and cohort studies that reported data on the rate of progression of AF [29].
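Because the model runs in 13-week cycles, an annual risk such as the 5.54% progression rate has to be expressed per cycle. The text does not state how this conversion was done; the sketch below shows one common approach, assuming a constant hazard within the year and four cycles per year. The function name is hypothetical.

```python
def annual_to_cycle_probability(annual_probability: float,
                                cycles_per_year: int = 4) -> float:
    """Convert an annual probability to a per-cycle probability.

    Assumes a constant hazard over the year, so that compounding the
    per-cycle probability over all cycles reproduces the annual value.
    """
    if not 0.0 <= annual_probability <= 1.0:
        raise ValueError("annual_probability must be between 0 and 1")
    return 1.0 - (1.0 - annual_probability) ** (1.0 / cycles_per_year)


# Annual progression from paroxysmal to permanent NVAF, as stated in the text.
annual_progression = 0.0554
cycle_progression = annual_to_cycle_probability(annual_progression)

# Compounding four 13-week cycles recovers the annual risk.
recovered_annual = 1.0 - (1.0 - cycle_progression) ** 4
print(f"Per-cycle progression probability: {cycle_progression:.4%}")
print(f"Recovered annual probability:      {recovered_annual:.4%}")
```

Simply dividing the annual risk by four would slightly understate the per-cycle probability, which is why the compounding form is usually preferred.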

The risks of experiencing thromboembolic and bleeding events for patients receiving treatment with an OAC were based on post-hoc analyses of the previously published ARISTOTLE trial (Table 2) [26, 30]. In the base-case analysis, patients identified with NVAF were assumed to be treated with apixaban, based on evidence of cost effectiveness versus other novel OACs (NOACs) and vitamin K antagonists (VKAs), but other treatments were tested in scenario analyses [31, 32]. Because head-to-head data comparing NOACs with no treatment were unavailable, an estimate of the hazard ratio of clinical events associated with no treatment versus OAC treatment was obtained from a meta-analysis [33]. The hazard ratio was multiplied by the event rates for OAC-treated patients to calculate the event risks for patients who do not receive OAC treatment.
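The scaling step described above can be sketched as follows. The event rate and hazard ratio used here are placeholders chosen for illustration only; they are not the values in Table 2 or in the cited meta-analysis. The conversion from a rate to a 13-week probability is likewise an assumption about how such rates are typically applied in a cohort model.

```python
import math


def untreated_event_rate(treated_rate_per_100py: float,
                         hazard_ratio_untreated_vs_treated: float) -> float:
    """Scale the OAC-treated event rate by the hazard ratio for no treatment."""
    return treated_rate_per_100py * hazard_ratio_untreated_vs_treated


def rate_to_cycle_probability(rate_per_100py: float,
                              cycle_years: float = 0.25) -> float:
    """Convert a rate per 100 patient-years to a per-cycle probability,
    assuming a constant (exponential) hazard within the cycle."""
    return 1.0 - math.exp(-(rate_per_100py / 100.0) * cycle_years)


# Placeholder inputs, NOT taken from the paper.
example_treated_rate = 1.0   # events per 100 patient-years on OAC
example_hazard_ratio = 2.0   # no treatment versus OAC

example_untreated_rate = untreated_event_rate(example_treated_rate,
                                              example_hazard_ratio)
example_cycle_probability = rate_to_cycle_probability(example_untreated_rate)
print(f"Untreated rate: {example_untreated_rate:.2f} per 100 patient-years")
print(f"Per-cycle probability: {example_cycle_probability:.4%}")
```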

Table 2 Risk of experiencing clinical events in patients with NVAF who are receiving apixaban and patients not on treatment

The risk of OAC discontinuation for reasons unrelated to stroke, bleeding, myocardial infarction, and systemic embolism was 13% in the first year (secondary analysis of the ARISTOTLE trial). This risk was reduced by 30% each year thereafter, based on the assumption in the STROKESTOP cost-effectiveness analysis [22]. Because real-world discontinuation rates are reportedly higher, we set the first-year OAC discontinuation rate (unrelated to stroke and bleeding) to 50.5% in a scenario analysis [34]. This estimate was considered likely to overestimate discontinuations, since it does not exclude OAC discontinuations related to stroke, bleeding, myocardial infarction, and systemic embolism, which are explicitly modeled.
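One reading of this schedule is that the annual discontinuation risk falls by a relative 30% each year, starting from 13% in year 1. The sketch below implements that interpretation, which is an assumption; the function names are hypothetical, and the persistence calculation ignores deaths and event-related discontinuations, which the model handles separately.

```python
def discontinuation_risk(year: int,
                         first_year_risk: float = 0.13,
                         annual_reduction: float = 0.30) -> float:
    """Annual discontinuation risk in a given treatment year (1-based),
    assuming a compounding relative reduction after the first year."""
    if year < 1:
        raise ValueError("year must be 1 or greater")
    return first_year_risk * (1.0 - annual_reduction) ** (year - 1)


def persistence_after(years: int, **kwargs) -> float:
    """Share of patients still on OAC after the given number of years,
    considering only this source of discontinuation."""
    remaining = 1.0
    for year in range(1, years + 1):
        remaining *= 1.0 - discontinuation_risk(year, **kwargs)
    return remaining


for year in (1, 2, 3):
    print(f"Year {year}: risk {discontinuation_risk(year):.2%}")
print(f"Still on OAC after 3 years: {persistence_after(3):.1%}")
```

Passing `first_year_risk=0.505` reproduces the starting point of the real-world scenario analysis under the same assumed decay.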

The risks of subsequent stroke, distribution of stroke by severity, and major bleeding events across different severities, and excess mortality as a result of clinical events were based on previous modelling efforts (Online Resource Table 1) [22, 26]. Background general mortality was based on 2014 US life tables [35]. For patients with AF, a hazard ratio was applied to background mortality rates, reflecting the increased risk of mortality versus the general population after accounting for events modelled [26].

2.2.3 Utilities and Costs

Health state utility values were obtained from a catalogue of EuroQol 5 dimension (EQ-5D) scores using the Medical Expenditure Panel Survey in the US, consistent with the approach utilized in the primary model (Online Resource Tables 2 and 3) [28, 36, 37]. Conservative EQ-5D utility values for stroke were used in this study; however, utility values estimated with time trade-off and standard gamble methods, which give a more pronounced estimate of the impact of stroke on utility, were examined in the scenario analysis [38, 39]. Costs were expressed in 2016 US dollar values. Treatment with apixaban was assumed to cost $4383 annually based on a list price of $6.47 per 5 mg tablet [40, 41]. Acute care costs associated with thromboembolic and bleeding events were obtained from the Centers for Medicare and Medicaid Services hospital inpatient prospective payment systems (fiscal year 2016), which is a conservative estimate of costs, given that those data do not capture payments to physicians during the admission (Online Resource Table 4) [42]. Long-term rehabilitation and management costs were obtained from the Medical Expenditure Panel Survey of the Agency for Healthcare Research and Quality [43]. Costs and utilities were discounted at a 3% annual rate.
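Discounting at 3% per year in a model with 13-week cycles can be illustrated as below. The text does not specify whether discounting was applied per cycle or per year, nor how annual drug costs were spread across cycles; the sketch assumes per-cycle discounting at the start of each cycle and an even split of the annual apixaban cost over four cycles. Function names are hypothetical.

```python
def discount_factor(cycle_index: int,
                    annual_rate: float = 0.03,
                    cycles_per_year: int = 4) -> float:
    """Discount factor for a cycle starting cycle_index cycles from baseline."""
    years_elapsed = cycle_index / cycles_per_year
    return 1.0 / (1.0 + annual_rate) ** years_elapsed


def present_value(per_cycle_values, annual_rate: float = 0.03,
                  cycles_per_year: int = 4) -> float:
    """Sum a stream of per-cycle costs or QALYs after discounting."""
    return sum(value * discount_factor(i, annual_rate, cycles_per_year)
               for i, value in enumerate(per_cycle_values))


# Annual apixaban cost from the text, split evenly over four cycles (assumption).
annual_apixaban_cost = 4383.0
per_cycle_cost = annual_apixaban_cost / 4

# Two years of continuous treatment, ignoring discontinuation and mortality.
pv_two_years = present_value([per_cycle_cost] * 8)
print(f"Undiscounted two-year cost: ${2 * annual_apixaban_cost:,.2f}")
print(f"Discounted two-year cost:   ${pv_two_years:,.2f}")
```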

2.3 Validation

The face validity of the model was established by presenting the initial model concept, the Markov diagram, model inputs, and final model structure to experts in health economics and a practicing cardiologist. Upon the completion of the model, a comprehensive and rigorous quality check was performed including validating the logical structure of the model, mathematical formulas and sequences of calculations, and the values of numbers supplied as model inputs, to ensure the model was technically valid. Cross-validity of the model was checked by comparing results projected from the model with those reported in the cost-effectiveness analysis conducted by the STROKESTOP authors [13].

2.4 Analyses

Outcomes were reported for a cohort of 10,000 patients compared with no screening over a time horizon equivalent to a patient’s lifetime. Health outcomes were evaluated in terms of AF cases detected, strokes avoided, and quality-adjusted life year (QALY) gained. Costs were presented as screening-related, anticoagulant-related, and medical event-related costs. Incremental costs and health outcomes were used to calculate incremental cost-effectiveness per QALY gained. Deterministic sensitivity analyses were conducted by setting parameter values to their lower and upper 95% confidence interval values one at a time. In probabilistic sensitivity analyses (PSA), all parameters were varied simultaneously (see Online Resource Table 5 for distributions). Scenario analyses were included to test the sensitivity of results to the treatment choice for patients with NVAF, background screening detection rate, undiagnosed NVAF prevalence, progression rate from paroxysmal to permanent NVAF, a lower specificity (92%) for Z14 and confirmatory screening with ECG 12-lead, alternative utility values, and higher OAC discontinuation rate for reasons unrelated to stroke, bleeding, myocardial infarction, and systemic embolism observed in the real-world setting.

3 Results

3.1 Validation

The adopted model structure, inputs, and representation of uncertainty via scenario analyses were found appropriate by experts in health economics and a practicing cardiologist. The technical validation exercises identified a few errors, which were corrected prior to conducting analyses. When compared to the STROKESTOP study, our model projected different estimates of life years gained, QALYs gained, and strokes prevented. The discrepancies could be explained by (i) increased mortality included in our model for patients who experienced events such as strokes, myocardial infarction, and systemic embolism (as opposed to no excess mortality in the STROKESTOP model); (ii) alternative sources for utility values; (iii) greater granularity of medical events (e.g. degree of severity in strokes); and (iv) use of a lower risk of stroke among patients with paroxysmal AF in our screening model. When controlling for these differences, our model projected similar estimates to those reported by the STROKESTOP authors.

3.2 Deterministic Base-case Results

In a cohort of 10,000 patients screened at age 75 and followed over their lifetimes, the number of additional AF cases detected was 54 with ECG 12-lead and 255 with Z14 compared with no screening (Table 3). The screening yield was 0.72% with the ECG 12-lead and 3.44% with Z14. Identification of additional NVAF patients and earlier treatment with anticoagulation led to fewer ischemic strokes among cohorts screened (lifetime ischemic strokes avoided: ECG 12-lead, 9.8 and Z14, 42.2) and subsequently better health outcomes in terms of quality of life (QALYs gained: ECG 12-lead, 31; Z14, 131) compared with no screening. Increased OAC treatment as a result of screening with ECG 12-lead and Z14 led to 7 and 32 more lifetime major bleeds compared with no screening, respectively.

Table 3 Lifetime health outcomes

All screening strategies led to increased costs, mostly due to OAC treatment of newly detected AF cases. Screening also led to increased costs due to implementation of screening strategies and increased costs to manage bleeding events that would have resulted from an increased number of patients on OACs. Incremental OAC, screening, and bleeding treatment costs were partially offset by cost savings from prevention of strokes, systemic embolisms, and myocardial infarctions. Overall, total lifetime incremental costs compared with no screening were $1,808,709 with ECG 12-lead and $6,283,638 with Z14 (see Table 4 for absolute costs from each strategy).

Table 4 Base-case results—per patient cost outcomes (in USD)

All screening strategies were cost effective when compared with no screening at a willingness-to-pay (WTP) threshold of $100,000 per QALY gained. The incremental cost per QALY gained compared with no screening was $58,728 with ECG 12-lead and $47,949 with Z14. Screening with Z14 resulted in higher benefits in terms of QALYs gained compared with ECG 12-lead (100 additional QALYs), and a lower incremental cost-effectiveness ratio (ICER) compared with no screening. Therefore, ECG 12-lead was extendedly dominated in our analysis, since Z14 gained more QALYs versus no screening at a lower cost per QALY gained (Online Resource Table 6).
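The ICER and extended-dominance logic can be retraced from the cohort-level figures reported above. The sketch below uses the incremental costs and the rounded QALY gains quoted in the text, so its ratios differ slightly from the published ICERs, which were presumably computed from unrounded QALYs. Variable and function names are illustrative.

```python
def icer(incremental_cost: float, incremental_qalys: float) -> float:
    """Incremental cost per QALY gained."""
    if incremental_qalys <= 0:
        raise ValueError("incremental QALYs must be positive for an ICER")
    return incremental_cost / incremental_qalys


# Incremental results versus no screening for a cohort of 10,000 (from the text).
ecg_cost, ecg_qalys = 1_808_709.0, 31.0    # one-time 12-lead ECG
z14_cost, z14_qalys = 6_283_638.0, 131.0   # 14-day Zenicor (Z14)

icer_ecg_vs_none = icer(ecg_cost, ecg_qalys)
icer_z14_vs_none = icer(z14_cost, z14_qalys)

# Moving from 12-lead ECG to the next more effective strategy (Z14).
icer_z14_vs_ecg = icer(z14_cost - ecg_cost, z14_qalys - ecg_qalys)

# Extended dominance: the less effective option has a higher ICER than the
# step up to the more effective option, so it is never the efficient choice.
ecg_extendedly_dominated = icer_ecg_vs_none > icer_z14_vs_ecg

print(f"ECG vs no screening: ${icer_ecg_vs_none:,.0f} per QALY")
print(f"Z14 vs no screening: ${icer_z14_vs_none:,.0f} per QALY")
print(f"Z14 vs ECG:          ${icer_z14_vs_ecg:,.0f} per QALY")
print(f"ECG extendedly dominated: {ecg_extendedly_dominated}")
```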

3.3 Sensitivity and Scenario Analyses

Deterministic sensitivity analyses highlighted that cost-effectiveness estimates for screening compared with no screening were most sensitive to the uncertainty around the stroke risk among untreated patients and patients treated with OAC, while the risk of experiencing myocardial infarction among untreated patients was also influential (Fig. 2, Online Resource Fig. 1). The incremental cost-effectiveness ratio (ICER) was below the $100,000 threshold for all screening strategies for all sensitivity analyses, indicating that the results were robust to uncertainty around individual parameter estimates.

Fig. 2 Deterministic sensitivity results: ICER of Z14 versus no screening. AF atrial fibrillation, ECG electrocardiogram, HR hazard ratio, ICER incremental cost-effectiveness ratio, MI myocardial infarction, Z14 Zenicor screening for 14 days

Conclusions regarding cost effectiveness of screening remained the same in all but one of the scenario analyses which explored the results under different assumptions on key parameters. Varying the OAC used for patients diagnosed with NVAF, the background detection rate, the prevalence of undiagnosed NVAF, the progression rate from paroxysmal to permanent NVAF, and specificity of Zenicor device did not alter the base-case conclusions. Using alternative utility values for ischemic and hemorrhagic stroke health states resulted in lower ICERs for the two screening methods compared with the base case. When the annual discontinuation risk in the first year was set to 50.5%, the ICER for Z14 was still under the WTP threshold, but the ICER for ECG 12-lead increased to $132,945. For ECG 12-lead and Z14, the ICER reached $100,000 per QALY gained when the first year OAC discontinuation risk for reasons other than stroke, bleeding, myocardial infarction, and systemic embolism was set to 41.3% and 51.8%, respectively. The increased ICER with higher discontinuation rates demonstrates that the effectiveness of AF screening depends on adherence to OAC and that any AF screening program needs to be coupled to emphasis on prescription of OAC and persistence on OAC. Except for the higher discontinuation rate scenario (for ECG 12-lead), the ICERs across scenarios remained under the $100,000 per QALY gained threshold (Table 5).

Table 5 Scenario analysis—ICER versus no screening (in USD)

When all model parameters were varied simultaneously in PSA, screening had a high probability of being cost effective, compared to no screening (Figs. 3, 4). The vast majority of the simulation runs resulted in ICERs below the WTP of $100,000 for 12-lead ECG (88%) and Z14 (95%) (Fig. 3). Intermittent screening with Z14 had the highest probability of being cost effective at thresholds above $52,000 per QALY gained (Fig. 4).
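The probability of being cost effective is the share of PSA runs in which a strategy is acceptable at the chosen threshold. The sketch below computes this share with the net monetary benefit, which gives the same answer as comparing the ICER with the threshold whenever incremental QALYs are positive and also handles runs with negative QALY gains. The four draws are made-up placeholders, not outputs of the published model.

```python
def probability_cost_effective(incremental_costs, incremental_qalys,
                               wtp: float) -> float:
    """Share of simulation runs with positive net monetary benefit
    (wtp * incremental QALYs - incremental cost) versus the comparator."""
    if len(incremental_costs) != len(incremental_qalys):
        raise ValueError("cost and QALY draws must have the same length")
    if not incremental_costs:
        raise ValueError("at least one simulation run is required")
    net_benefits = [wtp * qalys - cost
                    for cost, qalys in zip(incremental_costs, incremental_qalys)]
    return sum(benefit > 0 for benefit in net_benefits) / len(net_benefits)


# Placeholder PSA draws for one strategy versus no screening (NOT model output).
example_costs = [1_500_000.0, 2_000_000.0, 2_500_000.0, 1_000_000.0]
example_qalys = [30.0, 15.0, 40.0, 5.0]

share = probability_cost_effective(example_costs, example_qalys, wtp=100_000.0)
print(f"Probability cost effective at $100,000 per QALY: {share:.0%}")
```

Repeating the calculation over a grid of thresholds yields the points of a cost-effectiveness acceptability curve such as the one in Fig. 4.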

Fig. 3 Probabilistic sensitivity analysis results for a cohort of 10,000 patients 75 years of age. ECG electrocardiogram, QALY quality-adjusted life year

Fig. 4 Cost-effectiveness acceptability curve. ECG electrocardiogram

4 Discussion

Using a Markov modeling approach, this study estimated the clinical impact and cost effectiveness of one-time systematic screening of individuals upon reaching the age of 75 from the perspective of the US healthcare system. We analyzed two screening modalities: a one-time 12-lead ECG and intermittent screening with Zenicor over 14 days, as examined in the STROKESTOP study. Extended screening (Z14) was 4.8 times more likely to detect NVAF cases compared with one-time screening using 12-lead ECG. Both screening strategies (12-lead ECG and Z14) led to better health outcomes (i.e. more strokes avoided and higher QALYs) compared with no screening.

We estimated that both screening strategies would increase net cost, as higher spending on OACs was only partially offset by lower medical costs. Both screening strategies were cost effective relative to no screening using a threshold of $100,000 per QALY gained, but the lower yield of screening with 12-lead ECG meant that the Z14 extended screening strategy had a more favorable ICER ($58,728 with 12-lead ECG and $47,949 with Z14).

Our study adds to the existing evidence assessing the cost effectiveness of systematic screening for AF [9, 22], whereas many economic evaluations to date have focused on opportunistic screening [23,24,25]. Consistent with earlier studies assessing systematic screening in non-US settings, we found that systematic screening for AF was associated with gains in health outcomes at an economically justifiable cost when assessed from a US health system perspective. To our knowledge, our study is the first to examine cost effectiveness of AF screening from a US health system perspective. In addition, our study is the first to examine the cost effectiveness of both single-time point and extended intermittent systematic screening. We estimate that extended intermittent screening can detect more NVAF cases and has lower incremental cost per QALY than single time-point screening.

Moreover, our study expanded upon previously published analyses by considering a lower risk of stroke among screen-detected patients, based on the assumption that the majority of patients detected by intermittent screening with hand-held devices would have paroxysmal NVAF. This differs from earlier models [9, 22], which assumed the same risk of stroke for screen-detected patients, regardless of whether AF was paroxysmal or permanent. This enhancement, together with the incorporation of more granular mortality patterns that account for excess mortality beyond the acute period for ischemic and bleeding events, may result in a more accurate representation of the potential long-term health benefits associated with systematic screening.

Nonetheless, important areas for future research remain. Our analyses focused on the comparison of intermittent screening with Zenicor over 14 days versus one-time 12-lead ECG or no screening among a cohort of patients aged 75 years. The choice of screening device, frequency and age at screening in this analysis was dictated by the underlying evidence available [13]. However, there are numerous possible permutations of devices, screening frequency and age at screening, with limited guidance on which permutation constitutes the optimal strategy. For example, numerous devices in addition to Zenicor have been proposed to screen for AF including AliveCor, MyDiagnostick, WatchBP and Zio Patch among others. Most studies evaluating these devices assessed sensitivity versus 12-lead ECG at a single-time point (as opposed to over an extended period) or assessed detection rates without comparing to a reference method [44]. Further research is necessary to assess the comparative efficacy and cost effectiveness of these devices to understand the optimal screening strategy for AF. In a similar context, further evidence on the prevalence of undiagnosed AF—how it varies by age and how it can be modified by repeated screening—is required to ascertain whether repeated screening or screening at earlier ages would be effective.

4.1 Limitations

As with all simulation studies, our findings are indicative and need to be confirmed with empirical evidence. Specific limitations arise from our choice of assumptions and limited data availability. Considerable uncertainty remains regarding the risk of stroke among patients detected with intermittent screening as stroke risk appears to depend on the burden of AF [45,46,47,48,49]. For example, a retrospective study in untreated patients with paroxysmal AF suggested that a higher burden of AF identified by 14-day continuous ECG monitoring was associated with a higher risk of stroke or arterial thromboembolism [50]. The implication for our analysis is that extending the duration of screening from single time-point to extended intermittent captures NVAF in patients who presumably have lower disease burden and thus lower risk of stroke. Although we captured the differences in stroke risk between paroxysmal versus permanent NVAF and alternative detection rates of paroxysmal NVAF for extended screening, the differential NVAF burden in patients with paroxysmal NVAF was not considered. Evidence that is being generated from the STROKESTOP II trial, which will collect incidence of stroke over a 5-year period, may aid in understanding the risk of stroke in patients who are screened intermittently for AF and may identify the optimal duration of screening [17]. For the purposes of our analysis, thromboembolic and bleeding event risks and OAC discontinuation rates were based on the ARISTOTLE trial. This represents an additional limitation, as in the real-world setting, patients might have a different risk profile for these events compared with patients evaluated in a randomized clinical trial where improved monitoring and care may be received.

Similarly, there is considerable uncertainty surrounding the prevalence of undiagnosed NVAF in the US population; therefore, the prevalence of undiagnosed NVAF (detectable with Zenicor) was assumed to be equivalent to that observed in the STROKESTOP study [13]. Estimates used in our analysis are likely conservative, since the global burden of disease study conducted in 2010 noted a much lower prevalence of AF in Sweden (where the STROKESTOP study was conducted) compared to the US (400–475 vs 700–775 per 100,000) [51]. This hypothesis is corroborated by evidence from the mSToPS clinical trial, conducted in the USA, which identified AF in 3.9% of patients screened by 4 months [12]. As demonstrated in scenario analyses, cost-effectiveness estimates improved with higher prevalence estimates. However, conclusions on the cost effectiveness of the screening strategies in the current study were robust to a lower undiagnosed NVAF prevalence.

Furthermore, the costs of indeterminate readings were not considered in this study. Although the rate of indeterminate tests could be an important economic consequence of extended screening for NVAF, no data were identified regarding the proportion of cases in which screening is indeterminate, nor were data identified that reported the cost associated with an indeterminate reading. Because our analysis assumed that all patients screened would incur the cost of the Zenicor device, which allows future use, we assumed that additional screens as a result of indeterminate readings would confer minimal costs. Nonetheless, Zenicor is not available in the USA, necessitating the use of US cost estimates converted from Swedish estimates [22]. For intermittent screening to result in a cost-effectiveness ratio above $100,000 per QALY gained, the cost of Zenicor would need to be above $836.12, almost six times our estimated value.

A further parameter with a high level of uncertainty was the annual probability of spontaneous detection of undiagnosed NVAF in the absence of systematic screening. Scenario analysis suggested that all screening strategies remained under the WTP of $100,000 per QALY gained even when this annual probability was increased to 10%. In this scenario, additional AF cases detected decreased to 41 and 194 for ECG 12-lead and Z14, respectively, while the number of strokes avoided decreased to 7 and 30, respectively. Nonetheless, empirical evidence is required to assess the potential range of this parameter.

Finally, patients with NVAF who were eligible for OAC treatment in this analysis were assumed to be receiving apixaban. In clinical practice, patients would receive a mixture of NOACs and VKA therapy. Apixaban was chosen in the base case due to availability of patient-level data that would allow more granular evaluation of event rates and due to evidence of greater cost effectiveness with apixaban versus the other NOACs [52,53,54,55,56]. Use of a NOAC in the base case was also considered conservative due to the higher treatment costs associated with NOACs compared with warfarin [57].

5 Conclusions

Screening for NVAF has the potential to prevent NVAF-related strokes. In the Markov cohort model used in this study, screening for NVAF at age 75 with ECG 12-lead or Z14 was predicted to be cost effective at conventional thresholds compared with no screening. The results were sensitive to the risk of stroke, with a higher baseline risk of stroke implying increased health gains and lower ICERs. These results suggest a favorable cost effectiveness for AF screening in individuals aged 75 years; however, further research is required to understand the risk of stroke and background detection rate among patients with undiagnosed NVAF detected with intermittent screening. These results indicate that extended screening for NVAF in elderly adults with previously undiagnosed NVAF is a cost-effective approach.