Introduction

Gastroesophageal reflux disease (GERD) is a condition that develops when the reflux of stomach contents causes troublesome symptoms and/or complications [1]. Its incidence has been increasing worldwide in recent years [2]. Various guidelines recommend proton pump inhibitor (PPI) therapy as the mainstay of GERD treatment [3, 4]. However, some patients experience no improvement in heartburn symptoms even after PPI treatment, and physicians frequently encounter this problem in daily practice [5]. In patients with PPI-refractory heartburn and no previous evidence of reflux-related pathology on endoscopy or ambulatory reflux monitoring (unproven GERD), it is recommended that PPIs be discontinued and a combined multichannel intraluminal impedance–pH (MII–pH) test be performed [6, 7]. This test accurately distinguishes between patients with abnormal acid exposure time (AET) [true non-erosive reflux disease (NERD)], those with reflux hypersensitivity (RH), and those with functional heartburn (FH), and thereby allows appropriate treatment selection [4, 6, 7].

Vonoprazan is a potassium-competitive acid-secretion inhibitor, a member of a new class of acid-secretion inhibitors that has become available in recent years. It demonstrates stronger acid-suppressive effects than conventional PPIs, and several prospective studies have reported higher endoscopic cure rates with vonoprazan than with conventional PPIs for severe reflux esophagitis (Los Angeles classification grades C and D) [8,9,10]. Further, previous studies reported that vonoprazan effectively controls AET and improves symptoms in patients with PPI-refractory GERD [11], and that, because of its strong inhibitory effect, heartburn symptoms in patients with vonoprazan-refractory heartburn were not acid reflux related [12,13,14,15]. However, these reports included only a small number of patients who underwent MII–pH testing under continuous vonoprazan therapy, and their cohorts included patients with erosive esophagitis and esophageal motility disorders (EMDs); the effects of vonoprazan in patients with unproven GERD therefore remain unclear [12,13,14]. This study aimed to evaluate the effect of vonoprazan in patients with unproven GERD by comparing patients with vonoprazan-refractory heartburn with those with PPI-refractory heartburn.

Materials and Methods

Study Design and Participants

This retrospective cohort study was conducted in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines [16]. It included consecutive patients who, between September 2013 and November 2023, presented with heartburn symptoms despite at least 8 weeks of standard-dose vonoprazan (20 mg) or a standard-dose PPI (lansoprazole 30 mg, rabeprazole 20 mg, or esomeprazole 20 mg), had no evidence of erosive esophagitis or Barrett’s esophagus (> 1 cm of columnar-lined esophagus) on endoscopy (unproven GERD), and underwent MII–pH monitoring at Chiba University Hospital. Patients with PPI-refractory heartburn were defined as those who had not previously taken vonoprazan. In contrast, patients with vonoprazan-refractory heartburn were not asked about their previous history of taking PPIs. Heartburn symptoms were confirmed with the frequency scale for the symptoms of GERD (FSSG) questionnaire and the gastroesophageal reflux disease questionnaire (GERD-Q) [17, 18]. Patients were categorized into vonoprazan-refractory and PPI-refractory heartburn groups, and differences in their symptoms and test results were investigated. The Ethics Committee at Chiba University Hospital reviewed and approved this study (approval number 3873), which was conducted in compliance with the Declaration of Helsinki. Informed consent was obtained from all patients to undergo the procedures involved.

Questionnaire

All participants completed four questionnaires. The questionnaires were answered on the day of HRM and MII–pH testing, under vonoprazan/PPI discontinuation. The FSSG questionnaire consists of 12 items categorized into two domains. Each item was scored from 0 (never) to 4 (always). Dysmotility-like symptoms (dyspepsia score), the first domain, were calculated by summing the scores (range 0–28) for items 1, 4, 6, 7, 9, 10, and 12. Acid reflux-related symptoms (reflux score), the second domain, involved summing the scores (range 0–20) for items 2, 3, 5, 8, and 11 [17]. The GERD-Q is a self-reported questionnaire designed to diagnose GERD, comprising six items. Each item was rated on a 4-point scale. The scoring for items 1, 2, 5, and 6 is as follows: 0 points for no symptoms, 1 point for symptoms on 1 day, 2 points for 2–3 days, and 3 points for 4–7 days. The scoring for items 3 and 4 is reversed: 3 points for no symptoms, 2 points for 1 day, 1 point for 2–3 days, and 0 points for 4–7 days. Patients reported their symptoms over the previous week [18]. The Gastrointestinal Symptom Rating Scale (GSRS) questionnaire comprises 15 items and measures 5 categories of gastrointestinal symptoms, including reflux, abdominal pain, indigestion, diarrhea, and constipation. Values for each of the category scores were calculated as the mean of the respective items. The total GSRS score was calculated as the mean of all 15 items. The scoring is based on a Likert scale from 1 (minimal gastrointestinal symptoms) to 7 (most severe symptoms) points [19]. The Short Form-8 (SF-8) is a measure of health that generates physical and mental component summaries. Participants were asked eight questions about their health during the past 4 weeks to measure two scores, both between 0 and 100. The higher the score, the better the health state [20].
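The FSSG and GERD-Q scoring rules described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the input layout (a list of answers in item order), and the representation of GERD-Q answers as a number of symptom days are assumptions made here, not part of the original questionnaires or of the study's analysis.

```python
# Item groupings taken from the text above (1-based item numbers).
FSSG_DYSPEPSIA_ITEMS = (1, 4, 6, 7, 9, 10, 12)
FSSG_REFLUX_ITEMS = (2, 3, 5, 8, 11)
GERDQ_REVERSED_ITEMS = (3, 4)


def fssg_scores(answers):
    """Score the FSSG from 12 answers, each 0 (never) to 4 (always)."""
    if len(answers) != 12 or any(a not in range(5) for a in answers):
        raise ValueError("FSSG needs 12 integer answers between 0 and 4")
    dyspepsia = sum(answers[i - 1] for i in FSSG_DYSPEPSIA_ITEMS)  # range 0-28
    reflux = sum(answers[i - 1] for i in FSSG_REFLUX_ITEMS)        # range 0-20
    return {"dyspepsia": dyspepsia, "reflux": reflux, "total": dyspepsia + reflux}


def gerdq_item_points(item, days):
    """Convert symptom days in the previous week (0-7) to GERD-Q points.

    'days' as an integer is a hypothetical input format for this sketch.
    """
    if not 0 <= days <= 7:
        raise ValueError("days must be between 0 and 7")
    if days == 0:
        points = 0
    elif days == 1:
        points = 1
    elif days <= 3:
        points = 2
    else:
        points = 3
    # Items 3 and 4 are scored in reverse (3 points for no symptoms).
    return 3 - points if item in GERDQ_REVERSED_ITEMS else points


def gerdq_total(days_per_item):
    """Sum the six GERD-Q item scores; days_per_item is in item order 1-6."""
    if len(days_per_item) != 6:
        raise ValueError("GERD-Q needs 6 answers")
    return sum(gerdq_item_points(i + 1, d) for i, d in enumerate(days_per_item))
```

For example, a patient answering "always" to every FSSG item would receive a dyspepsia score of 28 and a reflux score of 20, which are the upper ends of the ranges given above.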

MII–pH Testing

The MII–pH test was performed after vonoprazan/PPI had been discontinued for > 7 days. After an overnight fast, the MII–pH catheter was inserted and left in place for 24 h. Catheters from Diversatek (formerly Sandhill Scientific, Boulder, Colorado, USA) were used. MII–pH findings were interpreted in accordance with the Lyon consensus [21, 22]. The MII–pH data were manually analyzed by three experienced investigators (M.S., T.M., and H.D.) using the BioView™ analysis software (Sandhill Scientific, Inc.). Patients were classified into three groups: true NERD, RH, and FH. True NERD was defined as AET > 6%, or as AET of 4–6% in the presence of other impedance parameters that support GERD. These supportive parameters were an excessive total number of reflux episodes (> 80), a low post-reflux swallow-induced peristaltic wave (PSPW) index (≤ 50%), and a low mean nocturnal baseline impedance (MNBI, ≤ 1500 Ω). RH was defined as AET ≤ 6% not meeting the above criteria for true NERD, with a positive symptom index (> 50%) and/or a positive symptom association probability (> 95%). FH was defined as AET ≤ 6% not meeting the above criteria for true NERD, with a negative symptom index and a negative symptom association probability [21, 22]. A PSPW was defined as a swallow occurring within 30 s of the end of a reflux episode that triggers an antegrade 50% drop in impedance relative to the pre-swallow baseline, originating in the most proximal impedance site and reaching all distal impedance sites [23]. The PSPW index was calculated by dividing the number of PSPWs by the total number of reflux episodes [24]. The MNBI was evaluated from the most distal impedance channel during night-time recumbency, in periods without reflux episodes, swallows, or pH drops [21, 25]. The mean of three measurements, each lasting at least 30 min, was manually calculated to obtain the MNBI [21].
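The classification rules above can be written as a small decision function. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and it assumes that a single supportive impedance parameter is enough to reclassify an AET of 4–6% as true NERD, which the text does not state explicitly.

```python
def pspw_index_pct(n_pspw, n_reflux_episodes):
    """PSPW index: PSPWs divided by total reflux episodes, as a percentage."""
    if n_reflux_episodes <= 0:
        raise ValueError("at least one reflux episode is required")
    return 100.0 * n_pspw / n_reflux_episodes


def mnbi_ohm(three_measurements):
    """MNBI as the mean of three night-time baseline impedance readings."""
    if len(three_measurements) != 3:
        raise ValueError("exactly three measurements are expected")
    return sum(three_measurements) / 3


def has_supportive_evidence(n_reflux_episodes, pspw_pct, mnbi):
    """True if any supportive parameter from the text is met.

    Treating 'any one parameter' as sufficient is an assumption of this sketch.
    """
    return n_reflux_episodes > 80 or pspw_pct <= 50 or mnbi <= 1500


def classify(aet_pct, supportive, symptom_index_pct, sap_pct):
    """Return 'true NERD', 'RH', or 'FH' following the thresholds above."""
    if aet_pct > 6:
        return "true NERD"
    if aet_pct >= 4 and supportive:
        return "true NERD"
    # AET <= 6% without meeting the true NERD criteria: use symptom association.
    if symptom_index_pct > 50 or sap_pct > 95:
        return "RH"
    return "FH"
```

Under these assumptions, a patient with an AET of 5% and an MNBI of 1200 Ω would be classified as true NERD, whereas the same AET without any supportive parameter would be classified as RH or FH depending on the symptom index and symptom association probability.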

Endoscopic Findings

All patients underwent esophagogastroduodenoscopy to confirm the absence of erosive esophagitis and Barrett’s esophagus. Erosive esophagitis was graded A–D according to the Los Angeles classification [26]. Barrett’s esophagus was identified as the replacement of normal epithelium with columnar epithelium in the distal esophagus between the esophagogastric junction (EGJ) and the squamocolumnar junction (SCJ), as assessed by the Prague classification system [27]. Patients with confirmed erosive esophagitis or Barrett’s esophagus (> 1 cm of columnar-lined esophagus) were regarded as having proven GERD and were excluded from the present study.

High-Resolution Manometry (HRM)

HRM was performed with a Diversatek system (Boulder, Colorado, USA) with 32 pressure sensors and 16 impedance sensors. EMDs were diagnosed according to the Chicago Classification, version 4.0 [28]. Patients with disorders of EGJ outflow or hypercontractile esophagus were excluded. The esophagogastric junction contractile integral (EGJ-CI) serves as an index of the barrier function of the EGJ. It is calculated using the distal contractile integral (DCI) box, placed to encompass the lower esophageal sphincter and crural diaphragm over three respiratory cycles, with the threshold set above gastric pressure. The resulting ‘DCI’ is then divided by the duration of these three respiratory cycles and expressed in mmHg cm [21].
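The EGJ-CI calculation described above reduces to a single division, sketched below in Python. The function name and the example numbers are hypothetical and are not taken from the study data; the sketch assumes the DCI box value is already available in mmHg cm s from the manometry software.

```python
def egj_ci(dci_box_mmhg_cm_s, duration_s):
    """EGJ-CI in mmHg cm: DCI box value divided by the duration of three
    respiratory cycles.

    dci_box_mmhg_cm_s: contractile integral measured over the EGJ (mmHg cm s),
        with the threshold set above gastric pressure.
    duration_s: total duration of the three respiratory cycles (s).
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return dci_box_mmhg_cm_s / duration_s


# Hypothetical example: a box value of 300 mmHg cm s over 10 s of breathing.
example_egj_ci = egj_ci(300.0, 10.0)
```

Dividing by the duration removes the time dimension, which is why the result is expressed in mmHg cm rather than mmHg cm s.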

Statistical Analysis

Categorical variables were expressed as n (%) and compared using the χ2 (chi-square) test; the proportions of true NERD, RH, and FH were compared using the χ2 test with Bonferroni correction. Continuous variables were expressed as mean ± standard deviation or as median and interquartile range (25th–75th percentile), and were analyzed using Student’s t-test or the Mann–Whitney U test, respectively. All statistical analyses were performed using Statistical Package for the Social Sciences version 23 (SPSS, Inc., Chicago, IL). All authors had access to the study data and reviewed and approved the final manuscript.
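As an illustration of the categorical comparison, the following Python sketch computes a Pearson χ2 test and a Bonferroni adjustment using only the standard library. It is not the SPSS procedure used in the study. The counts are back-calculated here from the percentages reported in the Results (52 patients per group) and should be read as an assumption; no continuity correction is applied, and the closed-form p-values are valid only for 1 or 2 degrees of freedom.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_p_value(stat, df):
    """Upper-tail p-value; closed forms exist for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch supports df = 1 or df = 2 only")


def bonferroni(p, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_comparisons)


# Assumed counts (vonoprazan group, PPIs group), back-calculated from percentages.
GROUP_SIZE = 52
counts = {"true NERD": (21, 14), "RH": (9, 9), "FH": (22, 29)}

# Overall 3 x 2 comparison of the diagnostic categories.
overall_stat, overall_df = pearson_chi2([list(pair) for pair in counts.values()])
overall_p = chi2_p_value(overall_stat, overall_df)

# Each category against the rest, with Bonferroni correction for 3 comparisons.
pairwise_p = {}
for name, (vpz, ppi) in counts.items():
    stat, df = pearson_chi2([[vpz, GROUP_SIZE - vpz], [ppi, GROUP_SIZE - ppi]])
    pairwise_p[name] = bonferroni(chi2_p_value(stat, df), len(counts))
```

With these assumed counts, the overall 3 × 2 test gives p ≈ 0.307, in line with the value reported in the Results for the comparison of diagnostic categories.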

Results

Study Flow Diagram and Patient Demographics

Figure 1 shows the flowchart of patient enrollment. From September 2013 to November 2023, MII–pH examinations were performed on 168 patients with vonoprazan- or PPI-refractory heartburn to investigate the cause. Of these, 26 patients with erosive esophagitis and 5 patients with EMDs (disorders of EGJ outflow in 3 patients and hypercontractile esophagus in 2 patients) were excluded. Of the remaining 137 patients, 104 patients with unproven GERD who discontinued vonoprazan/PPI and underwent MII–pH testing were included. Table 1 shows the background characteristics of the 52 patients with vonoprazan-refractory heartburn and the 52 patients with PPI-refractory heartburn. Age, gender, and body mass index did not differ significantly between the vonoprazan and PPIs groups, although the vonoprazan group had a slightly higher proportion of males. FSSG questionnaire scores were significantly higher in the vonoprazan group for the total (p = 0.012), reflux (p = 0.014), and dyspepsia (p = 0.026) scores. GSRS scores were significantly higher in the vonoprazan group for the total (p = 0.005), reflux (p < 0.001), abdominal pain (p = 0.011), and diarrhea (p = 0.013) scores. Mental component scores on the SF-8 questionnaire were lower in the vonoprazan group, but the difference was not significant (p = 0.062).

Fig. 1 Flowchart of patient enrollment. MII–pH combined multichannel intraluminal impedance–pH, PPI proton pump inhibitor, EMD esophageal motility disorder, EGJ esophagogastric junction

Table 1 Clinical characteristics of the patients

HRM Results Between Patients with Vonoprazan-Refractory and Those with PPI-Refractory Heartburn

Table 2 shows the HRM and MII–pH test results. Ineffective esophageal motility was observed in 21.2% of the vonoprazan group and 23.1% of the PPIs group. Neither the mean DCI nor the EGJ-CI differed between the two groups.

Table 2 Differences between vonoprazan-refractory and PPI-refractory heartburn patients

MII–pH Test Results Off Vonoprazan/PPI Therapy

The MII–pH test revealed no difference in AET values between the vonoprazan group (1.7% [0.2–8.4]) and the PPIs group (1.5% [0.5–5.2]). There were also no significant differences between the two groups regarding acid reflux and non-acid reflux episodes, or the percentage of acid reflux and non-acid reflux episodes in total reflux episodes. Other parameters demonstrated no significant differences between the two groups. The rates of true NERD, RH, and FH were 40.4%, 17.3%, and 42.3% in the vonoprazan group, and 26.9%, 17.3%, and 55.8% in the PPIs group, respectively (Fig. 2). The vonoprazan group demonstrated a higher true NERD rate, but with no significant difference (p = 0.307). AETs in patients with true NERD were not significantly different at 8.9% (6.3–14.0) and 9.0% (6.2–11.6) in the vonoprazan and PPIs groups, respectively (p = 0.564).

Fig. 2 MII–pH test results off vonoprazan/PPI therapy. MII–pH combined multichannel intraluminal impedance–pH, PPI proton pump inhibitor, NERD non-erosive reflux disease

MII–pH Test Results on Vonoprazan/PPI Therapy After Proven GERD

In the vonoprazan group, 8 of the 21 patients with true NERD underwent a repeat MII–pH test after resuming vonoprazan. The AET values after vonoprazan discontinuation and under oral vonoprazan were 10.4% (6.9–28.3) and 0.0% (0.0–0.3), respectively, and all AET values under oral vonoprazan were normal (Fig. 3). Seven of the eight patients had a negative symptom index, and one had a positive symptom index and was diagnosed with GERD overlapping with RH.

Fig. 3 MII–pH test results in patients with true NERD off and on vonoprazan therapy. AET acid exposure time

Discussion

Identifying the cause of symptoms and selecting appropriate treatment are important in managing patients with unproven GERD and PPI-refractory heartburn. Our study revealed that patients with vonoprazan-refractory heartburn were more symptomatic and had more functional dyspepsia (FD) and irritable bowel syndrome (IBS) symptoms than patients with conventional PPI-refractory heartburn. Further, heartburn symptoms during vonoprazan administration were not acid reflux related.

Generally, FD and IBS are frequently associated with GERD [29]. A previous study revealed that 31.4% of patients with GERD had FD and 29.5% had IBS, and these complications were associated with a significantly lower quality of life [29]. The present study revealed significantly stronger FD and IBS symptoms in patients with vonoprazan-refractory heartburn than in those with PPI-refractory heartburn. The absence of abnormal acid reflux in the MII–pH tests under continuous vonoprazan administration after the diagnosis of proven GERD indicates that a combination of FD and IBS symptoms may influence residual symptoms in patients with vonoprazan-refractory heartburn. Therefore, asking about FD and IBS symptoms and treating them accordingly may be useful for patients with vonoprazan-refractory heartburn.

Several reports have described the characteristics of patients with vonoprazan-refractory GERD. Okuyama et al. compared 26 patients with vonoprazan-refractory heartburn with 28 patients with PPI-refractory heartburn and identified the combination of FD symptoms and insomnia as factors associated with vonoprazan resistance [12]. Their cohort included 15 patients with erosive esophagitis (28% of the total), did not exclude EMDs, and did not undergo MII–pH testing, but their findings still support our results. Masaoka et al. performed MII–pH testing under vonoprazan/PPI administration in 16 patients with vonoprazan-refractory heartburn and 11 patients with PPI-refractory heartburn (including one case of erosive esophagitis in each group) and found lower AET values in the vonoprazan-refractory group than in the PPI-refractory group [13]. Hamada et al. retrospectively compared 39 patients with vonoprazan-refractory heartburn with 34 patients with PPI-refractory heartburn, including two cases of erosive esophagitis, and found that MII–pH testing performed under continuous vonoprazan/PPI therapy showed no abnormal acid reflux (AET > 4%) in the vonoprazan group. However, their cohort included 22 patients (30.1% of the total) with EMDs [14]. Thus, previous studies of vonoprazan-refractory heartburn were conducted in mixed populations of patients with ‘proven’ GERD, ‘unproven’ GERD, and EMDs. In contrast, our study excluded patients with EMDs and proven GERD and included only patients with unproven GERD. Proven GERD is less complicated to treat in practice because the diagnosis has already been established. For patients with unproven GERD, MII–pH testing is recommended after discontinuation of PPI therapy [6, 7]. Performing MII–pH testing after discontinuing vonoprazan/PPI, in line with this recommendation, is one of the strengths of our study.

Our study revealed that true NERD was more common in the vonoprazan-refractory group than in the PPI-refractory group, although the difference was not significant. Because the MII–pH tests repeated under vonoprazan showed normal AET (0.0% [0.0–0.3]) in all cases, this finding may reflect a higher proportion of patients with true GERD in the vonoprazan group rather than inadequate acid-secretion suppression.

This study has several limitations. The first is its retrospective design. The comparison between patients with vonoprazan-refractory and PPI-refractory heartburn was not a simple one: the choice between a PPI and vonoprazan was made by the attending physician and was not randomized. In addition, the vonoprazan-refractory group included patients who had previously taken PPIs. Although there were no differences in patients’ backgrounds or various laboratory results between patients with and without a history of PPI use (Supplementary Table 1), selection bias may have been present. Second, the questionnaires were completed after discontinuation of vonoprazan/PPI; therefore, drug discontinuation might have affected the results. Third, it was not possible to determine the timing of symptom onset for FD and IBS. Therefore, we cannot completely rule out the possibility that these symptoms began after administration of vonoprazan/PPI. Fourth, the MII–pH test was not repeated on therapy in all patients who were diagnosed with true NERD after vonoprazan/PPI discontinuation. Therefore, we cannot conclude that all residual heartburn symptoms under vonoprazan are unrelated to acid reflux. Hence, further investigation is required.

In summary, we have identified differences and characteristics between patients with unproven GERD with vonoprazan-refractory and those with PPI-refractory heartburn. Our results may be useful for subsequent treatment decisions in daily practice.