Introduction

Barrett’s esophagus (BE), thought to be a response to chronic inflammation due to gastroesophageal reflux disease (GERD), is defined by conversion of normal squamous epithelium of the tubular esophagus to a columnar-lined epithelium containing goblet cells [1]. Confirming the presence of BE has significant implications, since it is the only known precursor lesion of esophageal adenocarcinoma (EAC), a disease that has increased sixfold in incidence over the past three decades [2].

EAC is thought to develop via a GERD–metaplasia–dysplasia–cancer carcinogenic sequence [2]. Identifying individuals with BE through screening, and then enrolling them in surveillance programs to detect dysplasia and early EAC with endoscopic eradication as appropriate, is a potential method of reducing the incidence, morbidity, and mortality associated with EAC. Furthermore, screening detects not only BE but also prevalent dysplasia and early EAC that can be treated with endoscopic eradication therapy. For these reasons, the overall consensus among the major gastroenterological societies is to recommend screening patients with chronic GERD at risk for progression to EAC, despite the lack of any randomized controlled trials that have evaluated this practice [3,4,5,6].

Screening and surveillance guidelines rely on targeted tissue sampling of any visible mucosal abnormality found on visual inspection during endoscopy, followed by random 4-quadrant forceps biopsies (FB) obtained at 1–2 cm intervals throughout the BE segment (the “Seattle protocol”). Yet, the Seattle protocol suffers from a multitude of limitations including sampling error with random biopsy protocols leading to missed dysplasia [7, 8], significant diagnostic variability in assessing the presence and grade of dysplasia by pathologists, which leads to improper diagnoses [9, 10], and well-documented variability in physician adherence to current guidelines on obtaining adequate biopsies during surveillance endoscopy [11]. These limitations of the Seattle protocol highlight the need for better modalities that improve the accuracy and cost-effectiveness of endoscopic screening and surveillance in order to reduce the morbidity and mortality associated with EAC.

Over the past few decades, there has been a significant paradigm shift in the management of dysplastic BE and early EAC with the emergence of effective endoscopic eradication modalities including radiofrequency ablation (RFA), cryoablation, and endoscopic mucosal resection [12, 13]. Therefore, most economic model studies evaluating the cost-effectiveness of BE screening also incorporate surveillance to account for the beneficial effects of treatment of BE with associated dysplasia and early EAC. Screening GERD patients for BE was found to be cost-effective in studies if the willingness-to-pay threshold was < $100,000 per quality-adjusted life year [14]. A recent analysis with incorporation of radiofrequency ablation found surveillance of BE to be highly cost-effective [15]. No cost-effectiveness screening study to date has considered newer approaches that improve the effectiveness of screening.

Sampling error with FB is associated with failure to detect not only dysplasia but also goblet cells, resulting in false negatives in patients undergoing screening for BE [3, 4]. To overcome FB sampling error, wide area transepithelial sampling with three-dimensional computer-assisted analysis (WATS3D; CDx Diagnostics, Suffern, NY) was developed as an adjunct to the Seattle protocol to increase the effectiveness of BE screening and surveillance by increasing the detection rate of BE, BE-dysplasia, and early EAC. WATS3D uses an abrasive brush, deployed during endoscopy, which is designed to sample a much larger and circumferential mucosal area of the esophagus compared to FB. The WATS3D specimen is analyzed by pathologists with assistance from a specialized three-dimensional (3D) computer analysis system that uses neural networks and artificial intelligence to identify the most abnormal cells and cell clusters on a given sample for presentation to the pathologist. Images identified by the computer are reviewed by pathologists in conjunction with manual microscopy and are reported utilizing standard pathologic criteria for the diagnosis of BE, dysplasia, and EAC.

There have been five prospective studies evaluating WATS3D as an adjunct to the Seattle protocol in both screening and surveillance populations [16,17,18,19,20]. All studies have shown increased yield of detection of BE and dysplasia/EAC. We developed a model to analyze the cost-effectiveness of using WATS3D as an adjunct to the Seattle protocol in patients with chronic GERD being screened for BE, compared to screening with the Seattle protocol alone.

Methods

Problem Definition and Analyses

A decision analytic model was created to compare the cost, effectiveness, and cost-effectiveness of two strategies for screening for BE: random 4-quadrant forceps biopsies alone (FB) vs. forceps biopsy with WATS3D (FB + WATS3D). The simulated cohort consisted of 60-year-old white males with GERD not previously screened for BE. Sixty years of age was used to leverage results of previous cost-effectiveness models for surveillance of BE sponsored by the National Cancer Institute [15]. Results were calculated with Microsoft Excel 2016 (Microsoft Corporation, Redmond, WA, USA) and TreeAge Pro 2019 (TreeAge Software, Williamstown, Massachusetts, USA).

Effectiveness was measured in quality-adjusted life years (QALYs), which weight years of life by their health state utilities. Health state utilities represent patient preferences for different states of health, ranging from 0 (death) to 1 (full health). Cost was taken from the third-party payer perspective. All costs were adjusted to 2019 US$ based on the Medical Care Services component of the Consumer Price Index [21]. Both cost and effectiveness were discounted at the standard 3% per year. Following recommendations, incremental cost-effectiveness analysis was performed using two thresholds for cost-effectiveness: $100,000/QALY and $150,000/QALY [22,23,24]. Supplementary analyses were conducted to determine the number needed to screen to avert 1 cancer, and the number needed to screen to avert 1 cancer death.

No Institutional Review Board approval was necessary as there was no use of individual-level data, and therefore no human subjects.

Overview of Model

BE detected by positive FB or WATS3D was referred to surveillance, with treatment of future dysplasia. Results of surveillance were taken from results of two published models found in the literature and on the National Cancer Institute’s Cancer Intervention and Surveillance Modeling Network (CISNET) website [15].

Because incremental cost-effectiveness analysis was used, analysis was only done to compute differences between the two strategies. When patients would have identical results for both strategies, cost and effectiveness were not calculated since they would cancel in the incremental analysis. Thus, no calculations were made for anyone with a positive FB screen since they would be sent for surveillance in both strategies. Similarly, anyone with a negative FB and a negative WATS3D would not go into surveillance regardless of strategy. Patients with a negative FB and discordant positive WATS3D would be entered into a surveillance protocol. These cases had to be modeled and the cost and effectiveness calculated. For cases of true negative FB but false positive WATS3D, we assumed they would go into surveillance with FB and later be removed after two rounds of negative FB “confirmed” the false positive status of the original WATS3D screen. For cases of false negative FB and true positive WATS3D, we assumed they would remain in surveillance with future FB surveillance confirming the presence of BE. A summary of how each case is handled is found in Table 1.
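The case handling described above can be sketched as a small classification function. This is an illustrative sketch only, not the authors' TreeAge model; the function name and the returned labels are hypothetical.

```python
def incremental_pathway(fb_positive: bool, wats3d_positive: bool, has_be: bool) -> str:
    """Classify a screened patient for the incremental analysis (illustrative labels)."""
    if fb_positive:
        # Sent to surveillance under both strategies, so the difference is zero.
        return "surveillance in both strategies; cancels out"
    if not wats3d_positive:
        # Negative on both tests: no surveillance under either strategy.
        return "no surveillance in either strategy; cancels out"
    if has_be:
        # False negative FB, true positive WATS3D: later FB confirms BE.
        return "true positive WATS3D; remains in surveillance"
    # True negative FB, false positive WATS3D.
    return "false positive WATS3D; removed after two negative FB rounds"


print(incremental_pathway(fb_positive=False, wats3d_positive=True, has_be=False))
```

Only the last two branches contribute to the incremental cost and effectiveness, which is why they are the only cases that need to be modeled.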

Table 1 Overview of how each screening approach handles different test results

Model Parameters

All input parameters for the model are described below and summarized in Table 2.

Table 2 Input parameters

While there are many estimates of the prevalence of BE in the general and GERD populations, there is a paucity of literature on the proportion of screenings with FB that result in a positive result. Rubenstein et al. used a database of over 150,000 patients undergoing their first endoscopy and presented the results of BE detection by sex and GERD as indication for endoscopy, by decade of life [25]. For white males with GERD, the proportions with FB-positive results were 6.0% for age 40–49, 9.3% for age 50–59, and about 8.4% for age 60–69. Averaging the results for age 40–49 and 50–59, then 50–59 and 60–69, this would result in estimates of 7.65% for age 50 and 8.85% for age 60. We used 8.85% for our base case value, and 7.65% for the low end of the range for sensitivity analysis. For the high end, we used the value of 10.05%, which is the same distance from the base case as the low end of the range.
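The interpolation above is simple arithmetic and can be checked directly. The sketch below uses only the percentages quoted in the text; the variable and function names are illustrative.

```python
# FB-positive percentages by decade of life for white males with GERD (from the text).
fb_positive_pct = {"40-49": 6.0, "50-59": 9.3, "60-69": 8.4}


def midpoint(a: float, b: float) -> float:
    """Average of two adjacent decade estimates."""
    return (a + b) / 2


low = midpoint(fb_positive_pct["40-49"], fb_positive_pct["50-59"])   # estimate for age 50
base = midpoint(fb_positive_pct["50-59"], fb_positive_pct["60-69"])  # estimate for age 60
high = base + (base - low)  # symmetric upper bound around the base case

print(round(low, 2), round(base, 2), round(high, 2))
```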

Adding WATS3D to FB has been shown to increase the yield of positive screens for BE. Over time, changes, including increased brush size and optimized computer algorithmic analysis with further machine learning, have improved performance of WATS3D. For our model, we chose to use data from the most recent and largest study of WATS3D, which incorporated the enhanced three-dimensional computer analysis system and the larger sampling brush [19]. In this study, WATS3D increased the overall detection of BE by 213% when used adjunctively in screening [19]. Given the uncertainty around these key parameters, we chose to calculate incremental cost-effectiveness analysis results across a wide range of values for these parameters. For additional yield, we used the result from Smith et al. [19] along with one-half and one-third of the published result as possible outcomes for the model (213%, 106.5%, 71%). The false positive rate for WATS3D benefits from specimens being read only by expert WATS3D pathologists at a single laboratory. However, no data currently exist to estimate the rate of false positives for WATS3D. For false positives, we considered 5%, 15%, and 25% as possible outcomes. We then calculated the incremental cost-effectiveness ratios for all nine combinations of these values. While this two-way sensitivity analysis became our primary analysis, we considered the center cell (106.5% additional yield, 15% false positive WATS3D) to be our base case for the purpose of one-way sensitivity analyses on other parameters and for ease of exposition. The Smith study also suggests a much higher prevalence of BE than previously thought, and this is supported by other data. While a Swedish study showed just 1.6% prevalence in the general population [26], a US study in first-time screening colonoscopy patients aged 40 or over found a 6.8% prevalence [27].
A mathematical model of the US population, aligned with data from the Surveillance, Epidemiology and End Results registry (SEER), arrived at an estimated prevalence of 5.6%, supporting a higher prevalence than previously thought [28]. A modeling study by Hur et al. estimated the prevalence of BE and the prevalence of GERD by decade of life [29]. Averaging the results for ages 50–59 and 60–69, adjusting to reflect the 5.6% prevalence in the USA, and combining this with the prior estimates of a relative risk of 6.0 for patients with chronic GERD symptoms, we calculated a prevalence of BE in white males age 60 with chronic GERD to be 18.7% [26, 28, 29]. Compared to the 8.85% estimated FB positives (see above), this suggests the true prevalence of BE in 60-year-old white males with GERD to be 111.3% greater than that detected by FB alone. Our base case of 106.5% additional yield, minus 15% false positives, would result in 90.5% true additional BE detected.
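The two percentages in this paragraph, and the nine-cell grid of yield and false positive values, can be reproduced with a few lines. This sketch assumes the false positive rate is applied as a proportion of the additional WATS3D-positive screens, which is our reading of the text; the names are illustrative.

```python
true_prevalence_pct = 18.7  # estimated BE prevalence, white males age 60 with chronic GERD
fb_detected_pct = 8.85      # base case proportion of FB-positive screens

# How much greater the estimated true prevalence is than what FB alone detects.
undetected_excess_pct = (true_prevalence_pct / fb_detected_pct - 1) * 100


def net_true_yield(additional_yield_pct: float, false_positive_pct: float) -> float:
    """Additional yield remaining after removing the false positive share."""
    return additional_yield_pct * (1 - false_positive_pct / 100)


# All nine combinations considered in the two-way sensitivity analysis.
grid = {
    (y, fp): net_true_yield(y, fp)
    for y in (213.0, 106.5, 71.0)
    for fp in (5, 15, 25)
}

print(round(undetected_excess_pct, 1), round(grid[(106.5, 15)], 1))
```

Under this reading, the base case cell gives 90.525%, which the text rounds to 90.5%.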

Surveillance

The cost and effectiveness of BE surveillance with treatment for LGD vs. natural history were taken from modeling studies [15]. There were two different models that looked at surveillance with treatment for LGD. One was developed by researchers from Erasmus University and the University of Washington (Erasmus/UW), and the other by researchers at Massachusetts General Hospital (MGH). These two models are described in the supplement for Kroep et al. and at the National Cancer Institute’s CISNET site [15]. These models followed non-dysplastic Barrett’s esophagus (NDBE) patients from age 60 to 100 for natural history and for surveillance with RFA for LGD. Cost was computed using 2015 US$, and effectiveness was measured in QALYs. The Erasmus/UW model estimated the additional cost and effectiveness of surveillance at $3733 and 0.2215 QALYs, while the MGH model estimated these parameters at $5255 and 0.2048 QALYs. We used the average of the two models and adjusted the cost to 2019 US$.
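As a check on the averaging step, the sketch below computes the mean incremental cost and QALYs of the two models. The inflation adjustment is left as a function with a required `cpi_ratio` argument, because the ratio of the 2019 to the 2015 index is not stated in the text; that function and its parameter name are hypothetical.

```python
# Incremental cost (2015 US$) and QALYs of surveillance vs. natural history, from the text.
models = {
    "Erasmus/UW": (3733.0, 0.2215),
    "MGH": (5255.0, 0.2048),
}

mean_cost_2015 = sum(cost for cost, _ in models.values()) / len(models)
mean_qalys = sum(qalys for _, qalys in models.values()) / len(models)


def adjust_cost(cost: float, cpi_ratio: float) -> float:
    """Scale a cost by a price index ratio (target year index / source year index)."""
    return cost * cpi_ratio


print(mean_cost_2015, round(mean_qalys, 5))
```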

The Erasmus/UW and MGH models have estimated the number of cancers and cancer deaths for natural history and for surveillance with treatment for LGD for a cohort of 60-year-old males with newly diagnosed BE [15]. The MGH model estimated EAC incidence at 8.5% for natural history and 4.6% for surveillance with treatment, compared with 6.8% and 3.1% for the Erasmus/UW model. Thus, surveillance with treatment reduced EAC incidence by 45.9% and 54.4% for MGH and Erasmus/UW, respectively. For EAC-related death, MGH estimated 5.7% for natural history and 1.9% for surveillance with treatment, compared to 4.9% and 1.5% for Erasmus/UW. This corresponds to reductions of 66.7% and 69.4% for the two models. These rates of incidence and death, along with the reductions in both, due to surveillance vs. natural history, could then be multiplied by our model’s calculations of the number of additional BE cases in surveillance, as opposed to natural history due to the additional yield of WATS3D plus FB vs. FB alone. These results were expressed as the number of EAC cases and EAC deaths averted per 1000 screened. These numbers then were inverted to arrive at the number needed to screen overall using WATS3D plus FB to avert 1 additional cancer and 1 cancer death.
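The calculation described above can be sketched as follows. This is a back-of-envelope reconstruction, not the authors' model: it assumes the additional true BE cases per 1000 screened equal the FB-positive rate times the additional yield times one minus the false positive rate, using the base case values from the Model Parameters section.

```python
# Base case inputs (from the text); the way they are combined is an assumption.
FB_POSITIVE = 0.0885       # proportion of FB-positive screens
ADDITIONAL_YIELD = 1.065   # 106.5% additional yield with WATS3D
FALSE_POSITIVE = 0.15      # share of additional positives that are false

extra_true_be_per_1000 = 1000 * FB_POSITIVE * ADDITIONAL_YIELD * (1 - FALSE_POSITIVE)

# (natural history %, surveillance with treatment %) for each model and outcome.
rates = {
    "MGH": {"eac": (8.5, 4.6), "death": (5.7, 1.9)},
    "Erasmus/UW": {"eac": (6.8, 3.1), "death": (4.9, 1.5)},
}


def relative_reduction_pct(natural: float, surveillance: float) -> float:
    return (natural - surveillance) / natural * 100


def averted_per_1000(natural: float, surveillance: float) -> float:
    """Events averted per 1000 screened among the additional true BE cases."""
    return extra_true_be_per_1000 * (natural - surveillance) / 100


def number_needed_to_screen(natural: float, surveillance: float) -> float:
    return 1000 / averted_per_1000(natural, surveillance)


for model, outcomes in rates.items():
    for outcome, (natural, surveillance) in outcomes.items():
        print(model, outcome,
              round(relative_reduction_pct(natural, surveillance), 1),
              round(averted_per_1000(natural, surveillance), 1),
              round(number_needed_to_screen(natural, surveillance)))
```

Under these assumptions the sketch gives values in line with the ranges reported in the Results.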

Costs

Costs of WATS3D and FB were based on Medicare reimbursement rates. The WATS3D cost was based on 4 Current Procedural Terminology (CPT) codes: 88104, 88305, 88312, and 88361. The 88361 code was multiplied by 4 as there are usually 4 immunohistochemical (IHC) stains. The 2019 Medicare reimbursement total was $780. We assumed all patients were biopsied with forceps. While this assumption is unnecessary for FB costs, as they are handled the same in both strategies and cancel out, it may overestimate the cost of WATS3D plus FB if, in practice, not all patients would be biopsied. The cost for surveillance FB of false positive WATS3D included the facility charge (APC 5301/CPT 43239), physician charge (CPT 43239), as well as pathology charges (CPT 88305 + 88312) for each specimen jar. An unpublished analysis of Medicare claims data conducted for CDx Diagnostics by CodeMap found that on average 3.1 specimen jars were used for forceps biopsy. However, cases that were false positive WATS3D and true negative FB are likely to include many cases of very short length to biopsy, where fewer jars might be used. Nonetheless, we chose to be conservative and use 3.1 jars, which biases results against the use of WATS3D. We assumed surveillance of these false positives would be done at 3 and 6 years, following standard intervals for surveillance of NDBE.
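The cost of following false positives can be sketched as two discounted surveillance endoscopies, at 3 and 6 years, spread over everyone screened. The per-endoscopy cost is left as an argument because the total is not itemized in the text; the function names, the end-of-year discounting convention, and the way the false positive fraction is formed are all assumptions of this sketch.

```python
def discounted(amount: float, years: float, rate: float = 0.03) -> float:
    """Present value of an amount incurred `years` from now."""
    return amount / (1 + rate) ** years


def false_positive_surveillance_cost(
    cost_per_endoscopy: float,
    fb_positive: float = 0.0885,
    additional_yield: float = 1.065,
    false_positive: float = 0.15,
    years: tuple = (3, 6),
    rate: float = 0.03,
) -> float:
    """Expected discounted cost per person screened of surveilling false positives."""
    # Assumed fraction of all screened patients who are WATS3D false positives.
    fraction_false_positive = fb_positive * additional_yield * false_positive
    per_patient = sum(discounted(cost_per_endoscopy, y, rate) for y in years)
    return fraction_false_positive * per_patient


# Hypothetical per-endoscopy cost, for illustration only.
print(round(false_positive_surveillance_cost(1000.0), 2))
```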

Sensitivity Analysis

One-way sensitivity analysis was performed on all model parameters. Two-way sensitivity analysis was performed on the two key, uncertain WATS3D parameters: additional yield and false positive rate. Probabilistic sensitivity analysis was performed with each parameter modeled as a probability distribution, with 1000 trials to determine the proportion of the time each strategy was cost-effective for a range of willingness-to-pay values.

Probability and utility parameters were modeled as beta distributions, and cost parameters as gamma distributions. The base values were used for the means of these distributions, and the standard deviations were calculated based on using the low and high values of the one-way sensitivity analysis to represent a width of 4 standard deviations, akin to a 95% confidence interval [30]. The exception to the above was the parameter for the additional yield of WATS3D. The low and base values were chosen as being one-third and half of the value found in a recent large study, and the high value being the value from the study [19]. Accordingly, we used a triangular distribution with these three values as parameters (minimum, most likely, and maximum).
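One common way to turn a mean and a standard deviation into beta and gamma parameters is the method of moments, sketched below with Python's standard library. Whether the authors used this parameterization is not stated, so it should be read as an assumption; the function names are illustrative.

```python
import random


def sd_from_range(low: float, high: float) -> float:
    """Treat the one-way sensitivity range as spanning 4 standard deviations."""
    return (high - low) / 4


def beta_params(mean: float, sd: float) -> tuple:
    """Method-of-moments (alpha, beta) for a probability or utility."""
    common = mean * (1 - mean) / sd ** 2 - 1
    if common <= 0:
        raise ValueError("sd is too large for a beta distribution with this mean")
    return mean * common, (1 - mean) * common


def gamma_params(mean: float, sd: float) -> tuple:
    """Method-of-moments (shape, scale) for a cost."""
    return (mean / sd) ** 2, sd ** 2 / mean


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# FB-positive proportion: base 8.85%, range 7.65% to 10.05% (from the text).
alpha, beta = beta_params(0.0885, sd_from_range(0.0765, 0.1005))
fb_positive_draw = rng.betavariate(alpha, beta)

# Additional yield: triangular with minimum 71%, most likely 106.5%, maximum 213%.
yield_draw = rng.triangular(0.71, 2.13, 1.065)
```

A cost parameter would be drawn the same way with `rng.gammavariate(shape, scale)` once its range is taken from the input parameter table.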

Results

Using WATS3D as an adjunct for screening resulted in a reduction in both the number of cancers and cancer deaths (Table 3). Screening 1000 patients would result in 3.0–3.1 fewer cancers and 2.7–3.0 fewer cancer-related deaths. This result corresponds with needing to screen 320–337 people for BE using WATS3D to avert 1 cancer and 328–367 to avert 1 cancer death.

Table 3 Results—cancer and cancer deaths averted

Use of WATS3D increased cost through three components: WATS3D during every initial endoscopy ($779.91), surveillance of false positive WATS3D (but FB negative) screens ($35.72 per person originally screened), and the additional cost of surveillance over natural history for the WATS3D true positive/FB false negative cases ($403.54). Thus, the WATS3D strategy costs an additional $1219 per person screened.

The additional yield of WATS3D led to improved effectiveness, with a mean gain of 0.017 QALYs per person screened. The incremental cost-effectiveness ratio (ICER) was $71,395/QALY, which is well below both cost-effectiveness thresholds of $100,000/QALY and $150,000/QALY. These results are summarized in Table 3.
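The base case ICER can be approximately reconstructed from the figures reported in this paper. The sketch below is a back-of-envelope check rather than the authors' model: it assumes the incremental QALYs equal the average per-patient QALY gain of the two surveillance models multiplied by the fraction of screened patients who are additional true BE cases.

```python
# Incremental cost components per person screened (2019 US$, from the Results).
cost_wats3d = 779.91                        # WATS3D at the initial endoscopy
cost_false_positive_surveillance = 35.72    # FB surveillance of false positives
cost_true_positive_surveillance = 403.54    # surveillance vs. natural history
incremental_cost = (
    cost_wats3d + cost_false_positive_surveillance + cost_true_positive_surveillance
)

# Fraction of screened patients who are additional true BE cases (base case inputs).
extra_true_be = 0.0885 * 1.065 * (1 - 0.15)

# Average QALY gain of surveillance from the Erasmus/UW and MGH models.
qalys_per_be_case = (0.2215 + 0.2048) / 2

incremental_qalys = extra_true_be * qalys_per_be_case
icer = incremental_cost / incremental_qalys

print(round(incremental_cost, 2), round(incremental_qalys, 4), round(icer))
```

Under these assumptions the ratio comes out close to the reported $71,395/QALY. Dividing the rounded figures ($1219 by 0.017) gives a somewhat higher value, so the unrounded QALY gain matters here.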

In one-way sensitivity analysis, WATS3D plus FB remained cost-effective, with all ICERs remaining below $84,000/QALY for every parameter varied. Two-way sensitivity analysis of added yield of WATS3D plus FB and the rate of false positive WATS3D/FB negative is shown in Table 4. Using the $100,000/QALY threshold, WATS3D plus FB was cost-effective in 8 of 9 cells. The remaining cell slightly exceeded $100,000/QALY, at $105,224/QALY, still far below the $150,000/QALY threshold.

Table 4 Incremental cost-effectiveness ratios (ICERs) for two-way sensitivity analysis

The probabilistic sensitivity analysis demonstrated that at a willingness-to-pay (WTP) threshold of $100,000/QALY, WATS3D plus FB was cost-effective in 98.7% of the simulated trials. This rose to 100% with a WTP of $150,000/QALY.

Discussion

Our results demonstrate that the adjunctive use of WATS3D with the FB-based Seattle protocol for screening 60-year-old white males with GERD for BE is more cost-effective than the Seattle protocol alone. The addition of WATS3D to the Seattle protocol results in approximately 3 fewer cancers and 3 fewer cancer deaths per 1000 people screened for BE. Furthermore, the combination of WATS3D plus FB remained more cost-effective than FB alone in multiple sensitivity analyses.

Although standard upper endoscopy with Seattle protocol forceps biopsies is the most common screening test used, other modalities have been tested as a method of improving the effectiveness of screening and minimizing the invasiveness of the procedure. These include transnasal endoscopy, esophageal capsule endoscopy, Cytosponge® (Medtronic GI Health, Sunnyvale, CA), tethered capsule endomicroscopy, and electronic nose device [31]. In the recent Clinical Practice Update by the American Gastroenterological Association, Spechler and colleagues recommended against the use of any of these tests to screen for BE [31]. However, WATS3D, which has been demonstrated to increase the detection of BE in screening populations, and dysplasia and early EAC in surveillance populations, was recently endorsed as an adjunct to the Seattle protocol when evaluating patients with suspected or known BE in the recently published Guidelines of the American Society for Gastrointestinal Endoscopy (ASGE) [4]. Our results, which demonstrate WATS3D is cost-effective in screening patients with GERD, can have a dramatic impact on reducing the costs associated with EAC by identifying a greater number of BE patients. These patients could then be enrolled in surveillance programs to detect dysplasia and early EAC, and as appropriate, undergo minimally invasive, curative procedures that preempt the development of esophageal cancer.

Screening for BE is of benefit only if it is coupled with an effective surveillance program. Although endoscopic BE surveillance remains controversial due to a number of reasons, including the limitations of the Seattle protocol [32], observational and retrospective studies suggest that patients with EAC identified during a BE surveillance program have markedly improved survival compared to those who have not undergone surveillance [33,34,35]. Furthermore, surveillance of BE patients with subsequent endoscopic eradication therapy for patients identified with dysplasia has been shown to be cost-effective [14, 15]. WATS3D has been demonstrated to significantly decrease sampling error and therefore, increase detection of dysplasia in patients under surveillance [18, 19], and reduce the poor inter-observer agreement rate among even expert gastrointestinal pathologists in the diagnosis and grading of dysplasia [36]. Incorporating WATS3D into a surveillance program can have a significant positive effect on the health economic aspects not only of BE screening, but also surveillance. Additional studies are needed to confirm this.

There are a number of limitations of using the MGH and Erasmus/UW model results for the incremental cost and effectiveness of surveillance and treatment for LGD versus natural history of BE. The model results we leveraged were specifically for 60-year-old males with NDBE entering surveillance. This required the simplifying assumption that all positive screens were positive for NDBE and not dysplastic BE. In reality, perhaps as many as 10% may have had LGD on initial endoscopy, and a very small number of patients may have had HGD/EAC. As studies have shown that the added yield for WATS3D plus FB over FB alone is even greater in the detection of dysplasia than for BE [19, 20], relaxing this assumption would only make WATS3D plus FB even more cost-effective. Use of these models also requires that surveillance results be similar for 60-year-old white males versus all males. In fact, these two groups are largely the same. Whites make up about three-quarters of 60-year-old males, and due to their increased risk of BE, they make up an even greater proportion of 60-year-old males with BE [37, 38]. Although minorities have a significantly lower prevalence of BE compared with whites, the progression from BE to EAC does not appear to vary by race or ethnicity [36]. Since life expectancy is nearly identical for 60-year-old white males versus all males (21.8 years vs. 21.7 years), this is also unlikely to create any significant bias [39].

Screening at age 50 would add an additional 10 years of surveillance, with a greater number of cases of cancer detected at younger ages, saving even more years of life. The more sensitive strategy of WATS3D plus FB would likely be even more cost-effective. The MGH and Erasmus/UW models also make no mention of stopping surveillance at any age, with many cancers detected at older ages, where patients incur the high cost of cancer but die of other causes. Stopping surveillance at age 80, for example, might lead to a large reduction in cost with a smaller reduction in life years saved. The MGH and Erasmus/UW models also assumed costs of cancer treatment that may be substantially underestimated. A National Cancer Institute analysis of health care claims data found cancer costs nearly double those used by the CISNET models, after adjusting costs to the same period, 2015 US$ [40]. However, an ad hoc analysis showed that using these higher cancer costs had minimal impact on the resulting ICERs.

While our study accounted for the uncertainty of the added yield and false positive rate for WATS3D with no change in our conclusion, these results can be revisited in the future after longitudinal studies have been conducted to ascertain stable estimates of these parameters.

In summary, we demonstrate that the adjunctive use of WATS3D with the Seattle protocol using forceps biopsies is cost-effective when screening for BE, potentially reducing the morbidity and mortality associated with EAC in patients with GERD symptoms.