Introduction

In patients with systemic sclerosis (SSc), the gastrointestinal (GI) tract is the most commonly involved internal organ, with over two thirds of patients reporting GI symptoms [1]. GI tract involvement in SSc is a major cause of serious morbidity, affecting health-related quality of life (HRQoL) and survival of these patients [2, 3]. The most prevalent GI manifestation is esophageal involvement due to hypomotility and gastroesophageal reflux, the latter often leading to esophagitis and in later stages to Barrett’s esophagus [4]. Another GI manifestation of SSc is gastric antral vascular ectasia (GAVE), which may cause severe anemia [5]. To date, there are no recommendations or guidelines on when to perform endoscopic and functional investigation of the upper GI tract in patients with SSc. Esophago-gastro-duodenoscopy (EGD) plays a major role in the diagnosis of reflux esophagitis, esophageal strictures, Barrett’s esophagus, and adenocarcinoma of the esophagus.

The University of California at Los Angeles Scleroderma Clinical Trial Consortium GIT 2.0 instrument (UCLA GIT 2.0) is a patient-completed questionnaire validated to assess GI symptom severity and related HRQoL in SSc [6]. Originally developed in English, and with a minimal clinically important difference previously determined [6], it has been validated in different languages [7,8,9,10,11]. Several clinical trials of GI treatments in patients with SSc have already used this instrument as an outcome measure [12,13,14]. The UCLA GIT 2.0 is an excellent candidate to guide the need for further investigation of the GI tract by endoscopy and/or functional tests. Because it was constructed to reflect the burden of GI symptoms, including reflux, it is attractive to hypothesize that it can identify patients with endoscopic esophagitis or other clinically significant findings on EGD.

In this study, we aimed to determine, in an unselected, real-life cohort of patients with SSc, whether the UCLA GIT 2.0 could discriminate patients for whom a rheumatologist with experience in SSc would recommend an EGD, and whether the UCLA GIT 2.0 could identify patients at risk for endoscopic esophagitis or other clinically significant EGD findings.

Methods and patients

Study population

For this observational, post hoc analysis of prospectively collected data from the SSc cohort of the University Hospital Zurich, we selected patients who were included in the European Scleroderma Trials and Research Group (EUSTAR) database, fulfilled the ACR/EULAR 2013 criteria for the classification of SSc, and completed at least one UCLA GIT 2.0 questionnaire. Our center follows the EUSTAR recommendations for a detailed annual assessment, based on a standardized clinical approach and work-up [1]. Patients also complete additional questionnaires at their annual visits as part of that routine assessment, including the UCLA GIT 2.0. Investigations of the GI tract, such as EGD, are not included in the routine assessment and are selectively recommended by the expert rheumatologist after taking the history, performing the clinical examination of the patient, and evaluating the same-day work-up results (laboratory, lung function tests, lung imaging, electrocardiogram, and power-Doppler echocardiography). The UCLA GIT 2.0 questionnaire was not regularly used to decide on further GI investigation, although the rheumatologist may have had access to the patient self-reported data in at least some of the cases.

Data were retrieved from the prospectively collected EUSTAR registry for our center. In the EUSTAR database, information on gastrointestinal involvement is recorded by 3 items: esophageal symptoms (reflux and/or dysphagia), stomach symptoms (early satiety and/or vomiting), and intestinal symptoms (diarrhea, bloating, and/or constipation). To collect more detailed data on upper GI symptoms, presence of EGD, and treatment with proton pump inhibitors (PPI), we additionally performed a retrospective review of the electronic medical records (EMR) of the selected patients (see details in the online supplement). We also recorded the attending rheumatologist’s indication to perform an EGD at each visit of the patient. As some patients had more than one EGD, we selected for further analysis the EGD performed within a period of up to 3 months before or after the corresponding EUSTAR assessment visit and, if more than one EGD fell within this period, the one closest to the corresponding visit. Reflux esophagitis was graded according to the Los Angeles classification [15]. Patients with concomitant acute GI bleeding or a history of cancer in the upper GI tract were excluded from this study. The study was performed in accordance with the Declaration of Helsinki Ethical Principles and with GCP guidelines. Ethical approval for this data collection and analysis was issued by the cantonal ethics committee (BASEC Nr. PB2016-01515 and 2018-02165).
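The EGD selection rule described above (within 3 months before or after the visit, and the closest one if several qualify) can be illustrated with a minimal sketch. The function name `select_egd` and the approximation of "3 months" as 90 days are assumptions made for illustration only; the source does not state how the window was computed.

```python
from datetime import date

# Assumption: "3 months" is approximated as 90 days; the paper gives no exact day count.
WINDOW_DAYS = 90


def select_egd(visit_date, egd_dates, window_days=WINDOW_DAYS):
    """Return the EGD date closest to the visit within +/- window_days, or None.

    Hypothetical helper for illustration; not the authors' actual procedure.
    """
    # Keep only EGDs performed inside the window around the visit.
    in_window = [d for d in egd_dates if abs((d - visit_date).days) <= window_days]
    if not in_window:
        return None
    # Among the remaining EGDs, pick the one with the smallest distance in days.
    return min(in_window, key=lambda d: abs((d - visit_date).days))


visit = date(2018, 5, 1)
egds = [date(2018, 1, 1), date(2018, 4, 20), date(2018, 6, 15)]
print(select_egd(visit, egds))
```

In this example, the EGD from January falls outside the window, and of the two remaining EGDs the one performed 11 days before the visit is selected.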

UCLA GIT 2.0 questionnaire and study outcomes

The UCLA GIT 2.0 questionnaire contains 34 items, organized into seven subscales: reflux, distention/bloating, diarrhea, fecal soilage, constipation, emotional wellbeing, and social functioning. The subscales are scored from 0 to 3, higher scores indicating more severe symptomatology and worse HRQoL. Scoring of the diarrhea and constipation scales is different, ranging from 0 to 2 and 0 to 2.5, respectively. The total UCLA GIT 2.0 score is calculated by averaging all subscales, except the one for constipation, and ranges from 0 to 2.83 [6, 7].
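As a minimal sketch of the total score described above, the following averages the six subscales other than constipation. The dictionary keys and function name are hypothetical labels chosen for this example, and the handling of missing subscales is not addressed here because the text does not specify it.

```python
# Subscales entering the total score (all except constipation), per the description above.
TOTAL_SCORE_SUBSCALES = (
    "reflux",
    "distention_bloating",
    "diarrhea",
    "fecal_soilage",
    "emotional_wellbeing",
    "social_functioning",
)

# Maximum value of each subscale as stated in the text.
SUBSCALE_MAX = {
    "reflux": 3.0,
    "distention_bloating": 3.0,
    "diarrhea": 2.0,
    "fecal_soilage": 3.0,
    "constipation": 2.5,
    "emotional_wellbeing": 3.0,
    "social_functioning": 3.0,
}


def total_score(subscales):
    """Average the six subscales other than constipation (hypothetical helper)."""
    for name, value in subscales.items():
        if not 0.0 <= value <= SUBSCALE_MAX[name]:
            raise ValueError(f"{name} score {value} is outside 0 to {SUBSCALE_MAX[name]}")
    values = [subscales[name] for name in TOTAL_SCORE_SUBSCALES]
    return sum(values) / len(values)


# A patient at the maximum of every subscale reaches the stated upper bound of the total score.
worst_case = dict(SUBSCALE_MAX)
print(round(total_score(worst_case), 2))  # (3 + 3 + 2 + 3 + 3 + 3) / 6 = 2.83
```

The worked maximum shows where the upper bound of 2.83 comes from: five subscales contribute at most 3 each and diarrhea at most 2, giving 17/6.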

We defined three study outcomes: first, the recommendation to perform EGD by the SSc-specialized rheumatologist; second, macroscopic esophagitis identified on EGD (based on the EGD report and mentioning the Los Angeles grade of esophagitis), further referred to as “endoscopic esophagitis”; and third, any significant pathologic finding on EGD, further referred to as “pathologic EGD.” The latter included endoscopic esophagitis, mycotic esophagitis, esophageal strictures, Barrett’s esophagus, gastric antral vascular ectasia (GAVE), peptic ulcers, and tumors.

Statistical analysis

For statistical calculations, we used the statistical software IBM SPSS 25.0 and R version 3.6 (lme4 package) [16]. A p value < 0.05 was considered statistically significant. Numeric variables are described as median and inter-quartile range (Q1, Q3), while categorical variables are described as n and percentage. Comparisons between groups were performed with the chi-squared test for categorical variables and with the Mann-Whitney U test for numeric variables.

The parameters of interest for all three study outcomes were the UCLA GIT 2.0 total score and its reflux, distention/bloating, social functioning, and emotional wellbeing subscales. We analyzed their association with each of the three dichotomous outcomes of the study using multivariable generalized linear mixed effects models (GLMM) adjusted for random effects of subjects and fixed effects for all other candidate parameters mentioned. For the first outcome (recommendation to perform EGD), we excluded patients who had undergone EGD during the 3 months before their visit to our center, considering that in most of these patients a new EGD would not be recommended at the assessment.

The following parameters, which potentially influence the study outcomes (hereafter referred to as “covariates”), were selected by the authors based on clinical experience and evidence from published literature: age, sex, disease duration, cutaneous subset of SSc (diffuse vs. any other subset) [17], modified Rodnan skin score (mRSS), body mass index (BMI), hemoglobin (Hb), erythrocyte sedimentation rate (ESR), forced vital capacity (FVC), PPI therapy, gastro-esophageal symptoms as retrieved from the charts of the patients (heartburn, regurgitation, dysphagia, and vomiting), “esophageal symptoms” as recorded in the EUSTAR database (reflux and/or dysphagia), and “stomach symptoms” as recorded in the EUSTAR database (early satiety and/or vomiting).

For the outcome “recommendation to perform EGD,” we fitted the following GLMM models, using as covariates: 1. age, sex, disease duration, mRSS, SSc subset, BMI, Hb, ESR, FVC, and PPI therapy, which were also included in all the other models; 2. gastro-esophageal symptoms, as collected from the patient charts (heartburn, regurgitation, dysphagia, vomiting); 3. “esophageal symptoms” and “stomach symptoms” as recorded in the EUSTAR database; and in models 4 to 8, one of the selected subscales of UCLA GIT 2.0 (reflux, distention/bloating, social functioning, emotional wellbeing) or the UCLA GIT 2.0 total score, respectively.

For the outcomes “endoscopic esophagitis” and “pathologic EGD,” anticipating that the number of EGD would be less than one third of the number of visits with a completed UCLA GIT 2.0 questionnaire, we reduced the number of covariates included in the multivariable analysis. Consequently, all GLMM models for these outcomes included only four independent variables, selected by clinical judgment: age, sex, disease duration, and PPI therapy. Further GLMM models included these four parameters and one of the following parameters, or group of parameters: mRSS (model 2), Hb (model 3), gastroesophageal symptoms reported by the patient: heartburn, regurgitation, and dysphagia, as collected from EMR (model 4), “esophageal symptoms” and “stomach symptoms” (model 5), and one of the selected subscales of UCLA GIT 2.0 or the UCLA GIT 2.0 total score, respectively (models 6 to 10).

Using receiver operating characteristic (ROC) curve analysis, we further identified cutoffs for the reflux subscale and the total UCLA GIT 2.0 score that discriminated best between patients with a recommendation to perform EGD and those without, selecting the values with the largest area under the curve (AUC) and significant 95% confidence intervals (95% CI). Based on the AUC, the accuracy of the prediction model can be considered excellent (0.9–1.0), good (0.8–0.9), fair (0.7–0.8), or poor (0.6–0.7).
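To make the AUC and its grading concrete, the sketch below computes the AUC as the probability that a randomly chosen visit with an EGD recommendation has a higher score than a randomly chosen visit without one (ties counted as one half), and then maps the value to the categories listed above. The function names, the example scores, the treatment of interval boundaries (lower bound inclusive), and the `None` result for values below 0.6 are assumptions for illustration; the analysis in the study itself was run in SPSS and R.

```python
def auc(scores_positive, scores_negative):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


def grade_auc(value):
    """Map an AUC to the accuracy categories given in the text.

    Assumption: each lower bound is inclusive; values below 0.6 are not graded.
    """
    if value >= 0.9:
        return "excellent"
    if value >= 0.8:
        return "good"
    if value >= 0.7:
        return "fair"
    if value >= 0.6:
        return "poor"
    return None


# Made-up reflux scores, for illustration only.
recommended = [0.5, 1.0, 0.25]
not_recommended = [0.0, 0.25]
value = auc(recommended, not_recommended)
print(round(value, 3), grade_auc(value))
```

The pairwise formulation is equivalent to the area under the empirical ROC curve and avoids any dependency beyond the standard library; for cohort-sized data a rank-based implementation would be more efficient.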

Results

Patients and baseline characteristics

Out of 494 patients in the database, 346 fulfilled the inclusion criteria. For these, 940 visits with a completed UCLA GIT 2.0 questionnaire were available. The median number of visits per patient was 2 (Q1, Q3: 1–4), with 89/346 patients having one visit. Median follow-up time was 3.4 years (Q1, Q3: 1.8–4.9).

The demographic and clinical data of the patients are displayed in Table 1. The majority of participants were female (82.4%) and Caucasian (94.5%), and 23% had the diffuse cutaneous subtype of SSc; the median age was 63 years and the median disease duration was 10 years. Nine out of 343 patients had a history of Barrett’s esophagus. Of 346 patients, 261 (75.4%) reported GI symptoms and 311 (89.9%) had UCLA GIT 2.0 scores > 0 in at least one visit. GI symptoms recorded from the patients’ charts and UCLA GIT 2.0 scores (median and interquartile range) are displayed in Table 2. The reflux and distention/bloating subscales and the total score of UCLA GIT 2.0 had medians of 0.25, 0.50, and 0.22, respectively, while the medians of the other subscales were zero. Approximately 10% of the components of the UCLA GIT 2.0 questionnaire were missing overall. Of 940 visits, treatment with PPI was present in 588 (62.6%) visits at the time of completing the UCLA GIT 2.0 questionnaire at the annual assessment.

Table 1 Baseline demographic and clinical characteristics of the study cohort
Table 2 Gastrointestinal symptoms and scores of the UCLA GIT 2.0 and its subscales in all visits, N = 940

Evaluation of the UCLA GIT 2.0 as a potential decision-aiding instrument for EGD

Of 940 visits with completed UCLA GIT 2.0 questionnaires, 31 were excluded from this part of the analysis because patients had an EGD within 3 months before the visit. In the 909 remaining visits, EGD was recommended in 169, of which 120 were carried out (Figure S1 in the online supplement). Patients with a recommendation for EGD had significantly more frequent heartburn, dysphagia, and regurgitation, a history of Barrett’s esophagus, as well as higher mRSS scores and erythrocyte sedimentation rates; they also had significantly higher values of the UCLA GIT 2.0 score and all its subscales except the subscale for fecal soilage (Table 3).

Table 3 Comparison of patient data from visits in which EGD was recommended (n = 169) vs. data from visits in which EGD was not recommended (n = 740)

We next aimed to identify independent parameters associated with the expert recommendation to perform EGD. We found in multivariable GLMM models that mRSS, individual gastroesophageal symptoms (heartburn, dysphagia, and regurgitation, respectively), and upper gastrointestinal tract symptoms as recorded in the EUSTAR database (“esophageal symptoms” and “stomach symptoms”) were significantly associated with the recommendation to perform EGD. Except for the emotional wellbeing subscale, all the examined subscales of UCLA GIT 2.0, as well as the total score, were significantly associated with the recommendation to perform EGD (Table 4).

Table 4 Factors associated with referral to EGD (multivariable generalized linear mixed effects models, GLMM). Statistically significant results are highlighted in bold font

To identify optimal cutoffs for the reflux subscale and the total UCLA GIT 2.0 score, discriminating best between patients with a recommendation to perform EGD and those without, we performed ROC analysis. For the reflux subscale, the best results were found for the cutoff of 0.163 (AUC [95% CI] of 0.64 [0.60–0.68]), with a sensitivity of 73% and specificity of 50%. Similarly, for the total UCLA GIT 2.0 score, we identified the optimal cutoff of 0.161, with an AUC [95% CI] of 0.64 [0.59–0.68], sensitivity of 78%, and specificity of 46%. As the ranges of these scores are 0–3 and 0–2.83, respectively, this shows that even patients with a low symptom burden were referred for further evaluation by EGD.
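The sensitivity and specificity reported for a given cutoff can be reproduced in principle as in the sketch below. The function name, the example data, and the convention that a score at or above the cutoff counts as test-positive are assumptions; the text does not state whether the comparison was inclusive.

```python
def sensitivity_specificity(scores, recommended, cutoff):
    """Return (sensitivity, specificity) when scores >= cutoff are called positive.

    `recommended` holds True where EGD was recommended at that visit.
    Assumption: the cutoff comparison is inclusive.
    """
    tp = fn = tn = fp = 0
    for score, was_recommended in zip(scores, recommended):
        test_positive = score >= cutoff
        if was_recommended and test_positive:
            tp += 1
        elif was_recommended and not test_positive:
            fn += 1
        elif not was_recommended and test_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up reflux subscale scores for six visits, for illustration only.
scores = [0.0, 0.13, 0.25, 0.5, 0.0, 0.38]
recommended = [False, False, True, True, True, False]
sens, spec = sensitivity_specificity(scores, recommended, cutoff=0.163)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With a cutoff this close to the lower end of the scale, almost any reported reflux symptom yields a positive test, which is consistent with the combination of moderate sensitivity and low specificity described above.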

Evaluation of the UCLA GIT 2.0 as a potential predictor of endoscopic esophagitis and pathologic EGD

Of all 346 patients, 241 had undergone EGD at least once during the entire observation period. We identified 177 EGD matching the inclusion criteria, performed in 145 patients.

Of these, 128 were performed on indication from the SSc-expert rheumatologist of our center, and 49 were performed on indication from another physician, of which 31 were done during the 3 months preceding the visit (Figure S2 in the online supplement). A single EGD was performed in 118 patients, 22 patients had undergone two EGDs, and five patients had undergone three EGDs. The median time between the visit and the corresponding EGD was 2 days (Q1, Q3: − 0.5, 36), with a mean of 9.7 days.

Esophagitis was found in 52/177 EGD (in 50 patients), GAVE in 15/177 EGD (in 12 patients), and biopsy-verified Barrett’s esophagus in 24/177 EGD (in 19 patients). Other EGD findings were fungal esophagitis in 7, esophageal strictures in 2, peptic ulcers of the stomach or bulbus duodeni in 3, and gastritis in 6 EGD, leading to a total of 94/177 pathologic EGD.

Patients with endoscopic esophagitis significantly more frequently had EUSTAR-reported esophageal symptoms (“reflux and/or dysphagia”) and had slightly higher mRSS scores, while the differences in individual upper gastrointestinal tract symptoms (heartburn, dysphagia, and regurgitation), as well as in the UCLA GIT 2.0 score and subscales, did not reach statistical significance (Table 5). Patients with esophagitis also tended to be less frequently under treatment with PPI (52.7% vs. 72.4%, p = 0.057) while, surprisingly, they had slightly but significantly higher Hb values than patients without esophagitis (median Hb value 13.6 g/dl vs. 12.9 g/dl, p = 0.008).

Table 5 Comparison of patient data from visits in which EGD detected esophagitis (n = 52) vs. data from visits in which EGD did not detect esophagitis (n = 125)

We next analyzed whether clinical parameters can be identified that are independently associated with the presence of esophagitis or other pathologic GI tract findings. In multivariable GLMM analysis on the outcome of endoscopic esophagitis, mRSS and EUSTAR-reported esophageal symptoms (“reflux and/or dysphagia”) were the only parameters associated with endoscopic esophagitis; however, the associations were very weak (with an OR of only 1.1 for mRSS and a low AUC of 0.61 for esophageal symptoms) (Table 6). Hemoglobin correlated with endoscopic esophagitis in the univariable model, but not in the multivariable model. The UCLA GIT 2.0 total score and its subscales showed no association with endoscopic esophagitis. Similar negative results were obtained in the GLMM analysis for the outcome of pathologic EGD (Table S1 in the online supplement), suggesting that in our real-life cohort, the UCLA GIT 2.0 failed to identify patients with EGD findings.

Table 6 Factors associated with esophagitis on EGD (multivariable generalized linear mixed effects models, GLMM)

Discussion

To the best of our knowledge, this is the first study to analyze the performance of the UCLA GIT 2.0 in a large real-life cohort of unselected patients with SSc. Our results show that the UCLA GIT 2.0 score and its reflux subscale identified patients with SSc in whom EGD was recommended by experts, with a sensitivity of over 70% and a specificity of about 50%.

The recommendation for EGD was made in all patients by a rheumatologist with experience in SSc, at the annual visit of the patient and following a comprehensive investigation, as defined by the EUSTAR guidelines [1]. There was no regular use of the UCLA GIT 2.0 questionnaire to decide further GI tract investigation. We excluded patients with concomitant acute GI bleeding or a history of cancer in the upper GI tract, as in these cases, the indication for EGD would be driven by other criteria than the symptoms captured by the UCLA GIT 2.0.

Considering that several clinical or laboratory data might influence the indication of EGD, or might predict EGD findings, we adjusted the analyses for all these parameters, which were included as covariates for the GLMM models after a careful selection based on clinical judgment and evidence from published literature. For example, we expected the recommendation for EGD to be favored by anemia, possibly caused by gastrointestinal bleeding, which is frequent in SSc, especially in the presence of GAVE [5]; however, our data did not show any association between Hb and the referral to EGD. On the other hand, mRSS was significantly associated with the recommendation for EGD, but the very small OR suggests that this association is of little clinical significance. As expected, patients with a history of Barrett’s esophagus were more frequently referred to EGD. We did not have enough data on significant weight loss or decrease in Hb, and no data on other objective markers of GI involvement, such as F-calprotectin, to include these among the selected covariates.

The recommendation to perform EGD was significantly associated with higher UCLA GIT 2.0 reflux, distention/bloating, and social functioning subscale scores, as well as with higher total scores. As expected, we found similar significant associations for individual symptoms like heartburn, dysphagia, and regurgitation, as well as for these symptoms clustered together as esophageal symptoms and stomach symptoms. These results support the use of the UCLA GIT 2.0 questionnaire in practice, as it provides the attending rheumatologist with detailed information on gastrointestinal symptoms and helps guide the further investigation of the GI tract.

In the second part of the study, we tested the hypothesis that the reflux subscale or the total score of the UCLA GIT 2.0 would be associated with endoscopic esophagitis or with a pathologic EGD in general. Data on the associations of the UCLA GIT 2.0 with objective upper GI tract findings in patients with SSc are scarce. Previous studies analyzed smaller groups of selected patients, in whom GI tract investigation and completion of the UCLA GIT 2.0 were performed systematically and within a narrow time interval [18,19,20]. A prospective study on 55 patients with SSc and clinically significant upper GI tract symptoms found a moderate correlation between the reflux scale of the UCLA GIT 2.0 and endoscopic esophagitis; the reflux subscale also discriminated between patients with and without pathologic findings on esophageal manometry [18]. Another study on 40 patients with SSc, of whom 85% reported upper GI tract symptoms, found an association of higher reflux and total UCLA GIT 2.0 scores with decreased amplitude of distal esophageal contractions [19]. A very recent study on 31 patients with SSc, assessing esophageal motility dysfunction by scintigraphy, found a significant association of esophageal emptying activity with the GIT 2.0 reflux score, but not with the other subscales or the total UCLA GIT 2.0 score [20].

In our study on a large cohort of real-life patients, neither the total UCLA GIT 2.0 score nor the reflux subscale correlated with endoscopic esophagitis. The only parameters showing associations with this outcome were the EUSTAR-recorded “esophageal symptoms” (defined as the presence of reflux and/or dysphagia), and the mRSS. For the latter, the very low OR suggests the association is of little clinical importance. Not surprisingly, the symptom interpretation by the physician (as presence or absence of “esophageal symptoms”) performed better than single symptoms such as “heartburn”, as recorded in the patient EMR, in detecting patients with esophagitis.

The lack of correlation between esophagitis and single symptoms or the UCLA GIT 2.0 reflux scale may be explained by several factors, including the non-systematic use of EGD, the variable time between EGD and the UCLA GIT 2.0 completion, and the use of PPI in about 60% of patients. Moreover, large studies performed by gastroenterologists in patients with gastro-esophageal reflux disease (GERD) have shown that an expert history, as well as GERD questionnaires, such as the reflux disease questionnaire and gastroesophageal reflux disease questionnaire, have important limitations when compared with objective testing for GERD by EGD or functional testing [21,22,23]. Studies with systematic EGD in unselected patients with SSc are scarce [24, 25]. In a single-center SSc cohort study, Petcu et al. found endoscopic esophagitis in 8/22 patients without any GI symptoms and in 39/57 patients with GI symptoms. Only 12/26 patients with gastroesophageal reflux symptoms had esophagitis on endoscopy [24]. The authors advocate the routine use of EGD during the early stage of SSc, even in the absence of typical symptoms.

The strengths of our study lie in the large, real-life cohort of unselected patients from a tertiary SSc center with long-standing experience, and in the statistical methods applied, which allow adjusting for a large number of independent parameters potentially associated with the study outcomes. The study also has several limitations, which include the partially retrospective data collection. However, the large majority of the data were collected prospectively following the EUSTAR recommendations [1]. There was considerable variability in performing EGD, as in some patients this was not done despite being recommended, and in others it may have been done in another center, with the results not recorded in the EMR of our hospital. However, over 70% of EGD recommended by our center were performed and the respective results were available in the hospital EMR. It is possible that some reports of EGD performed outside our hospital may not have reached us, but we assume that in many of these cases EGD was probably not done, as our center strives to obtain all medical information of the patients and communication between local medical facilities is generally good. Another limitation is the time of ± 3 months allowed between questionnaire completion and EGD, which is quite long and may have contributed to the lack of correlation between UCLA GIT 2.0 scores and the results of the EGD. Finally, yet importantly, treatment with PPI was not recorded in detail and we were not able to analyze the indication for PPI, doses, or compliance.

Conclusions

In a large real-life cohort of unselected patients with SSc, we found a significant association of the UCLA GIT 2.0 score with the interpretation of GI symptoms by rheumatologists and subsequent recommendations for EGD. However, there was no association of the UCLA GIT 2.0 score, or its subscales, with endoscopic esophagitis or with any pathologic findings on EGD. Even the correlation between single symptoms, such as heartburn and dysphagia, and endoscopic esophagitis was poor. We conclude that, while using the UCLA GIT 2.0 in the routine care of patients with SSc may help the rheumatologist to better understand the burden of GI symptoms in the individual patient, it should not be used as a stand-alone instrument to identify an indication for EGD. The question of whether all or selected patients with SSc should be investigated by EGD needs to be addressed by further studies.