Introduction

Gastric carcinoma (GC) is the third most common cause of cancer-related death worldwide, with the highest incidence recorded in Eastern Asia, South America, and Eastern Europe [1]. The 5-year survival of this disease is 25% [2], which suggests that in the majority of cases, the disease is diagnosed in the late stages. Given its high prevalence and poor prognosis, early detection and prevention are important for the successful management of GC.

GC risk depends on the severity and extent of the premalignant gastric conditions. Correa et al. [3] suggested that intestinal-type GC occurs as a result of progressive changes of the gastric mucosa, which implies that gastric mucosal atrophy and intestinal metaplasia are associated with increased carcinoma risk [4]. Moreover, earlier studies have confirmed that the relative risk of GC is high in patients with severe atrophic and/or metaplastic gastritis [5, 6]. Therefore, identifying high-risk population groups could facilitate the implementation of early intervention strategies, which in turn could improve the prognosis of GC.

In 2005, Rugge et al. proposed the Operative Link on Gastritis Assessment (OLGA) system for the grading and staging of the phenotypes of long-standing gastritis [7]. Subsequently, Capelle et al. proposed the Operative Link on Gastric Intestinal Metaplasia Assessment (OLGIM) system that recognizes intestinal metaplasia in gastric mucosa in an easier and more consistent manner [8]. OLGA and OLGIM are grading and staging standards developed from the Sydney System, which is dependent on the histopathology findings of gastroscopic biopsy sampling; these systems provide information on the extent of the atrophic or intestinal metaplastic changes related to cancer risk [7,8,9].

Patients with gastritis of stages III/IV, i.e., high-risk OLGA/OLGIM stages, have been reported to have a significantly higher risk of GC in different epidemiological settings [8, 10,11,12]. Although studies suggest that the OLGA and OLGIM systems have great potential for the risk stratification of GC, there is no systematic review or meta-analysis to support this claim. Moreover, doubts have been raised on whether the OLGA/OLGIM stages are accurate and whether these methods underestimate the cancer risk [13, 14].

Thus, there is still a lack of knowledge on the importance and accuracy of OLGA and OLGIM stages in the assessment of GC risk. To bridge this gap, we sought to undertake a systematic review and meta-analysis of case–control and cohort studies to elucidate the association between OLGA/OLGIM stages and GC risk and assess the strength of this association.

Methods

Search strategy

This meta-analysis was performed in accordance with the MOOSE statement guidelines [15]. A comprehensive search of the medical literature was conducted using the PubMed, EMBASE, MEDLINE, and Cochrane databases for full-text articles as well as abstracts published up to March 31, 2017. We also screened the WHO International Clinical Trials Registry Platform (2004–2017) to identify additional studies relevant to this review.

The literature search and review were carried out independently by two investigators (HY and LS). The keywords or corresponding Medical Subject Heading (MeSH) terms used for the search were as follows: “operative link on gastritis intestinal metaplasia assessment”, “operative link on gastritis assessment”, “gastric cancer”, and “gastric intraepithelial neoplasia”. The details of the search strategy are provided in Supplementary Appendix Table S1.

Study selection

All citations were imported into a reference manager (Endnote X8.0.1) for the assessment of eligibility for this meta-analysis. In a blinded, standardized manner, two investigators (HY and LS) independently reviewed the title, abstract, or full text of all the identified articles. The corresponding authors were contacted to collect missing data or assess eligibility. Any disagreements regarding the eligibility of a study were resolved by mutual discussion or consultation with a third reviewer (LB).

We included published case–control studies and cohort studies that met the following inclusion criteria: (1) evaluated and defined exposure of interest as OLGA/OLGIM stage III/IV; (2) endpoint of interest was incident gastric cancer (including incidence rate and cancer mortality); (3) the number of events was provided or could be calculated from the data in the publication; and (4) availability in English. Studies were excluded if they were cross-sectional; were published as letters, reviews, editorials, case reports, or expert opinions; lacked extractable data; included subjects with precancerous conditions as controls; or duplicated or substantially overlapped with other included studies.

Data extraction and quality assessment

All the data from the selected studies were extracted independently by two reviewers (HY and LS) and checked by a third reviewer (LB). In particular, data were collected for the following parameters: (1) the first author’s surname and publication year; (2) country of origin; (3) study design (case–control or cohort); (4) sample size (GC and control); (5) mean age; (6) sex; (7) staging system; (8) number of cases of GC and controls in OLGA or OLGIM stage III/IV; and (9) duration of follow-up.

The Newcastle–Ottawa Scale (NOS) was used for assessing the quality of the included case–control and cohort studies [16, 17]. Each NOS item could be awarded at most 1 point, except for comparability, which could be awarded up to 2 points. The maximum score possible for a given study was 9 points. Studies with a score of 5 points or more were considered to be of high quality, while the others were considered to be of low quality [18].

Data synthesis and analysis

Separate meta-analyses were performed for case–control and cohort studies. The primary outcome in the case of cohort studies was the incidence of GC, while the secondary outcome was high-grade dysplasia (occurring during the procedure). Odds ratios (ORs) were used for case–control studies, whereas risk ratios (RRs) were used for cohort studies. Heterogeneity between studies was evaluated using the Q and I2 statistics, with I2 ≥ 50% indicating moderate to high heterogeneity [19]. The choice between the random-effect and fixed-effect models was based on the heterogeneity analysis [20]: the fixed-effect model was applied if I2 was < 50%, and the random-effect model was chosen if I2 was ≥ 50%. When I2 was ≥ 50%, we also performed subgroup analyses to detect the potential sources of heterogeneity. Sensitivity analyses were carried out to check the robustness of the pooled ORs and RRs by eliminating one study at a time. Publication bias was assessed using funnel plots and the Harbord test; in the Harbord test, a P value > 0.05 was considered to indicate the absence of significant publication bias. All statistical analyses were performed using RevMan (version 5.3.0) and Stata (version 14.0) software packages.
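The model-selection rule described above can be sketched in Python as follows. This is a minimal illustration and not the RevMan or Stata implementation: it assumes inverse-variance weighting on the log odds ratio scale and the DerSimonian-Laird estimator for the between-study variance, and the 2×2 counts in the usage note are hypothetical placeholders rather than data from the included studies.

```python
import math


def log_or_and_variance(a, b, c, d):
    """Log odds ratio and its variance for one 2x2 table.

    a, b: cases with / without stage III/IV
    c, d: controls with / without stage III/IV
    (all four cells are assumed to be non-zero)
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool(tables, threshold=50.0):
    """Pool ORs: fixed effect if I2 < threshold, otherwise random effects."""
    effects, variances = zip(*(log_or_and_variance(*t) for t in tables))
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Cochran's Q and the I2 statistic (percentage, truncated at 0)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(tables) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    if i2 < threshold:
        model, estimate, se = "fixed", fixed, math.sqrt(1 / sum(weights))
    else:
        # DerSimonian-Laird between-study variance (assumed estimator)
        scale = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / scale)
        re_weights = [1 / (v + tau2) for v in variances]
        estimate = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
        se = math.sqrt(1 / sum(re_weights))
        model = "random"

    return {
        "model": model,
        "or": math.exp(estimate),
        "ci": (math.exp(estimate - 1.96 * se), math.exp(estimate + 1.96 * se)),
        "q": q,
        "i2": i2,
    }
```

For example, `pool([(30, 70, 15, 85), (40, 60, 20, 80)])` (two hypothetical studies with similar ORs) selects the fixed-effect model, whereas `pool([(30, 70, 15, 85), (60, 40, 10, 90)])` (two hypothetical studies with very different ORs) selects the random-effect model.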

Results

Study selection

Of a total of 452 articles retrieved by our literature search, 8 studies comprising 2700 cases were included in the meta-analysis of OLGA/OLGIM stages and gastric cancer risk, including two cohort studies [8, 10] and six case–control studies [21,22,23,24,25,26]. The study selection process is illustrated in Fig. 1.

Fig. 1 Flow chart showing the selection of studies

Study characteristics

The salient features of the eight included studies are summarized in Table 1. The studies were published between 2008 and 2016. Four studies used both the OLGA and OLGIM systems, while the remaining four studies were based on OLGA only. Among the case–control studies, the OLGA system was applied in all six studies [21,22,23,24,25,26], while the OLGIM system was applied in only three [23, 25, 26]. Both cohort studies applied the OLGA system [8, 10], while only one of them also applied the OLGIM system [8].

Table 1 Characteristics of the studies included in the meta-analysis

Quality assessment

Seven of the eight studies had NOS quality scores greater than or equal to 5, indicating a high level of methodological quality. The remaining study had a NOS score of 4, indicating low study quality. The quality scores of the included case–control studies are provided in Table 2, and those of the cohort studies are shown in Table 3.

Table 2 Results of quality assessment using the Newcastle–Ottawa Scale for case–control studies
Table 3 Results of quality assessment using the Newcastle–Ottawa Scale for cohort studies

OLGA/OLGIM and the risk of GC

Case–control studies in OLGA

The six case–control studies included comprised a total of 2482 individuals and included one conference abstract [22] and five published manuscripts [21, 23,24,25,26]. Using a random-effect model, the meta-analysis of odds ratios (OR) demonstrated that GC risk was significantly higher among patients with OLGA stage III/IV (OR 2.64; 95% CI 1.84–3.79; P < 0.00001), but with significant heterogeneity (P = 0.03, I2 = 60%; Fig. 2).
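As a rough consistency check on the reported heterogeneity, I2 can be inverted to recover Cochran's Q, and the corresponding chi-square tail probability can be compared with the reported P value. The sketch below assumes the standard definition I2 = (Q − df)/Q with df = 6 − 1 = 5; because the reported I2 is rounded, the result is only approximate. The function names are illustrative, and the closed-form tail probability used here holds only for odd degrees of freedom.

```python
import math


def q_from_i2(i2_percent, n_studies):
    """Back-calculate Cochran's Q from I2 (in percent) and the number of studies."""
    df = n_studies - 1
    return df / (1 - i2_percent / 100)


def chi2_sf_odd_df(x, df):
    """Upper-tail probability of a chi-square variable (odd df only)."""
    if df % 2 != 1:
        raise ValueError("closed form implemented for odd df only")
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)


q = q_from_i2(60, 6)           # about 12.5
p = chi2_sf_odd_df(q, 5)       # about 0.029
```

Under these assumptions, Q is about 12.5 and the tail probability is about 0.029, which agrees with the reported P = 0.03 after rounding.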

Fig. 2 Forest plot of odds ratio (OR) for gastric cancer (GC) of high stage of OLGA versus low stage in case–control studies. The cumulative GC risk among patients with OLGA stage III/IV was 2.64 (95% CI 1.84–3.79; I2 = 60%; n = 6)

Since heterogeneity was present, we performed subgroup analyses, as shown in Table 4, by stratifying the combined data according to the study region (Japan vs. Korea vs. China), sample size (< 100 vs. ≥ 100), NOS score (< 5 vs. ≥ 5), control selection (healthy control vs. unhealthy control), GC staging (early GC vs. mixed GC), and Lauren’s classification (intestinal GC vs. mixed GC). We found that the NOS score might be a source of heterogeneity in the case–control studies based on OLGA. In the studies of high quality (NOS score ≥ 5), OLGA stage III/IV was associated with an increased risk of GC (OR 2.41; 95% CI 2.02–2.88; P < 0.00001); the estimate was markedly higher (OR 11.48; 95% CI 3.16–41.60; P = 0.0002) in the remaining study, which had a low NOS score.

Table 4 Subgroup analyses of the association between high-stage OLGA and gastric cancer risk

Case–control studies in OLGIM

Three case–control studies comprising a total of 1266 individuals were included [23, 25, 26]. Using a fixed-effect model, the meta-analysis of ORs showed that GC risk was significantly higher among subjects with gastric lesions of OLGIM stage III/IV (OR 3.99; 95% CI 3.05–5.21; P < 0.00001), and no significant heterogeneity was observed (P = 0.39; I2 = 0%; Fig. 3).

Fig. 3 Forest plot of odds ratio (OR) for gastric cancer (GC) of high stage of OLGIM versus low stage in case–control studies. The cumulative GC risk among patients with OLGIM stage III/IV was 3.99 (95% CI 3.05–5.21; I2 = 0%; n = 3)

Cohort studies based on OLGA

The two cohort studies [8, 10] included in this meta-analysis comprised 218 individuals. Meta-analysis of the risk ratio (RR) using a fixed-effect model indicated a significant association between OLGA stage III/IV and the risk of developing GC (RR 27.70; 95% CI 3.75–204.87; P < 0.001; Fig. 4), without significant heterogeneity (P = 0.56, I2 = 0%). No funnel plot was constructed since the number of studies included was small.

Fig. 4 Forest plot of risk ratio (RR) for gastric cancer (GC) of high stage of OLGA versus low stage in cohort studies. The cumulative GC risk among patients with OLGA stage III/IV was 27.70 (95% CI 3.75–204.87; I2 = 0%; n = 2)

Cohort studies based on OLGIM

One study [8] from the Netherlands reported the association between OLGIM stage III/IV and GC risk. The study followed a prospective cohort of 125 patients who were diagnosed with intestinal metaplasia or dysplasia and staged according to the OLGIM system, and it recorded the incidence of GC or high-grade dysplasia at the end of the 6-year follow-up period. Two patients with OLGIM stage III/IV developed high-grade dysplasia (RR 16.67; 95% CI 0.80–327.53); because the confidence interval included 1, this association was not statistically significant.
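A very wide confidence interval of this kind is typical of sparse data, for example when one group has no events. The Python sketch below shows one common way to compute an RR and its 95% CI from a 2×2 table, adding 0.5 to every cell when any cell is zero. The continuity correction is an assumption made for illustration (the source does not state how sparse cells were handled), and the counts in the usage note are hypothetical rather than taken from the cohort study [8].

```python
import math


def risk_ratio(events_exposed, n_exposed, events_control, n_control, z=1.96):
    """Risk ratio with a Wald-type CI on the log scale.

    Adds 0.5 to every cell if any cell is zero (assumed continuity correction).
    """
    a, b = events_exposed, n_exposed - events_exposed
    c, d = events_control, n_control - events_control
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    n1, n0 = a + b, c + d

    rr = (a / n1) / (c / n0)
    se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper
```

With hypothetical counts such as `risk_ratio(2, 30, 0, 95)` (two events among 30 high-stage patients and none among 95 low-stage patients), the point estimate is well above 1, but the lower confidence limit falls below 1, mirroring the pattern reported above.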

Publication bias

The publication bias of the eight included studies was assessed using funnel plots and the Harbord test. No obvious asymmetry was found in the two groups of case–control studies, and the results of the Harbord test also showed no evidence of publication bias for OLGA or OLGIM (P = 0.708 and P = 0.061, respectively). However, the enrolled cohort studies were too few to allow evaluation of publication bias.

Sensitivity analysis

A “leave-one-out” sensitivity analysis showed that our results were robust: eliminating each of the included studies in turn did not cause any substantial variation in our findings. After the elimination of a single case–control study, the pooled ORs varied between 2.42 (95% CI 1.82–3.22) and 2.83 (95% CI 2.10–3.82) for OLGA stages and between 2.91 (95% CI 1.74–4.88) and 4.11 (95% CI 3.11–5.42) for OLGIM stages.
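The leave-one-out procedure can be sketched in Python as follows. This is an illustrative outline only: it assumes inverse-variance fixed-effect pooling of log ORs, which may differ from the model used for each analysis in the paper, and the study labels and effect sizes in the usage note are hypothetical.

```python
import math


def pooled_or(studies):
    """Inverse-variance fixed-effect pooled OR with 95% CI.

    studies: iterable of (log_or, standard_error) pairs.
    """
    studies = list(studies)
    weights = [1 / se ** 2 for _, se in studies]
    estimate = sum(w * y for w, (y, _) in zip(weights, studies)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return (math.exp(estimate),
            math.exp(estimate - 1.96 * se),
            math.exp(estimate + 1.96 * se))


def leave_one_out(studies):
    """Re-pool the data once per study, omitting that study each time.

    studies: dict mapping a study label to a (log_or, standard_error) pair.
    Returns a dict mapping the omitted label to (OR, lower, upper).
    """
    return {
        omitted: pooled_or(v for name, v in studies.items() if name != omitted)
        for omitted in studies
    }


# Hypothetical inputs for illustration only
example = {
    "Study A": (math.log(2.0), 0.30),
    "Study B": (math.log(3.0), 0.25),
    "Study C": (math.log(2.5), 0.40),
}
results = leave_one_out(example)
```

The range between the smallest and largest pooled OR in `results` corresponds to the kind of interval reported in the sensitivity analysis above; a narrow range suggests that no single study drives the pooled estimate.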

Discussion

The OLGA and OLGIM systems are used as potential histological staging systems for the assessment of the risk of GC and have generated considerable interest in GC screening and surveillance. Although some studies failed to demonstrate a correlation between the detection of OLGA or OLGIM stage III/IV and a favorable outcome in GC [13, 14], several studies, albeit small and controversial, have shown the value of high-risk OLGA and OLGIM stages in the stratification of GC risk [8, 10]. To the best of our knowledge, no meta-analysis has been conducted on the importance and accuracy of the OLGA and OLGIM systems. Herein, we report the results of this meta-analysis to clarify this issue and explore the clinical value of these tools in the assessment of GC risk.

In the present study, we analyzed the results of the published observational studies on subjects with OLGA or OLGIM stage III/IV and the risk of GC, and our results confirmed that stage III/IV as defined in both systems had a marked correlation with GC risk. These findings were consistent with those of Rugge et al. [11], who first proposed the OLGA system, in a study demonstrating that most cases of high-grade neoplasia or invasive gastric neoplasia consistently showed high-risk OLGA or OLGIM stages (97.6% for OLGA stages, and 92.7% for OLGIM stages).

More specifically, for the OLGA staging system, we enrolled six case–control studies and two cohort studies in this meta-analysis. It is worth highlighting that the follow-up duration of these two prospective cohort studies was more than 6 years (one study followed the participants for 12 years), which may be considered adequate. Analysis of cohort studies revealed that individuals with high-risk OLGA stages had a 27.7-fold higher risk of developing GC as compared to their counterparts. Although the case–control studies showed significant heterogeneity, the summarized OR for all studies showed a positive relationship between high OLGA stages and GC.

Taken together, the results of the subgroup analysis and the study quality assessment suggest that the heterogeneity may have been caused by the study by Satoh et al. [21], which had the lowest NOS score of all the case–control studies. Therefore, our analysis suggested that high-stage OLGA could serve as an independent risk factor for GC. Moreover, three case–control studies and one cohort study employed the OLGIM staging system. In our study, analysis of these three case–control studies showed that subjects with higher stages of OLGIM had a 3.99 times higher risk of GC than others. Meanwhile, the result of the single cohort study agreed in direction with those of the case–control studies, although its confidence interval included 1. In the sensitivity analysis of our study, no substantial changes were discovered; additionally, no publication bias was detected, indicating that our combined results may be unbiased. Consequently, subjects with high-risk OLGA/OLGIM stages had a higher risk of GC as compared to those with low-risk stages.

It is evident that the OLGA and OLGIM classification systems have considerable clinical value in GC screening and surveillance of precancerous gastric lesions. As far as these two systems are concerned, the suitable surveillance intervals for patients with precancerous conditions remain controversial. According to the guidelines of the European Society of Gastrointestinal Endoscopy, endoscopic surveillance should be performed in patients with extensive atrophy and/or extensive intestinal metaplasia every 3 years [11]. However, in accordance with recent recommendations regarding monitoring for these patients in Chinese and Japanese populations, patients with extensive atrophic gastritis and/or intestinal metaplasia should undergo endoscopic surveillance every 1 year; those with moderate atrophic gastritis, every 2 years; and those with none-to-mild gastritis, every 3 years (or adjusted by the patient’s condition) [27,28,29]. Combined with our findings, we suggest that patients aged above 40 years should undergo upper gastrointestinal endoscopy for GC screening, with staging of gastric lesions as per the OLGA and OLGIM systems. More importantly, surveillance intervals for patients with OLGA and OLGIM stage III/IV should be shortened, even in the absence of any obvious lesions, and endoscopists should be cautious and take more biopsy specimens (if necessary) to enable the diagnosis of early GC. In addition, the OLGA staging system has been reported [30, 31] to show low interobserver agreement among general pathologists, but a higher sensitivity. Meanwhile, the OLGIM staging system is characterized by a higher interobserver agreement. However, a substantial proportion of potentially high-risk individuals would be missed if only OLGIM staging were applied [32, 33]. Therefore, we recommend combining the OLGA and OLGIM systems, which may allow the risk of GC to be assessed more accurately in routine clinical practice.

There are some inevitable limitations in this analysis. The retrieved studies on OLGA and OLGIM systems were not focused on clinical research related to intervention or prognosis; therefore, randomized controlled trials focusing on this condition have been scarce. Owing to the fundamental methodological limitations of observational studies, the importance of the findings of such studies could be attenuated to some extent. Furthermore, a majority of the studies had a small sample size, and the number of studies in each research type was limited, which might have affected the integrity and authenticity of the collected data. Besides, there was no uniformity in the selection of controls, with some of them being non-cancer patients.

The control population comprised subjects with various diseases, such as functional dyspepsia, gastric ulcer, duodenal ulcer, and chronic atrophic gastritis. However, control selection was not identified as a potential source of heterogeneity in the subgroup analyses, which suggests that its influence might be limited. In addition, we did not account for the staging and typing of GC in our study, since our main focus was to confirm whether the OLGA and OLGIM stages had a significant correlation with GC by comparing the GC population and controls; however, it is undeniable that some included studies had mixed populations, which does not allow for differentiation. Nevertheless, we also performed a subgroup analysis of the two studies that exclusively enrolled patients with early GC, which showed positive findings, but with high heterogeneity.

The significance of these two systems lies in the screening and surveillance for precancerous gastric conditions and early GC; their validity for comparing different types of advanced GC may be less critical. Lastly, although we conducted a thorough search and sought to cover as many studies as possible, some unpublished studies may have been missed. However, both the funnel plot and the Harbord test showed no obvious publication bias in our results, indicating that the influence of such studies might be limited.

More high-quality, large-scale, multi-center clinical studies are warranted on the application of the OLGA or OLGIM staging systems in the detection of early GC. Moreover, it is necessary to compare the accuracy of the OLGA and OLGIM systems in more high-quality studies, because investigators have expressed conflicting opinions regarding the preferred system for the evaluation of gastric cancer risk [8, 32,33,34]. Furthermore, studies on the combined application of OLGA and other testing methods, such as the pepsinogen ratio and measurement of gastrin 17 levels, may be beneficial for the comprehensive assessment of GC risk.

In summary, our meta-analysis revealed that stage III/IV of the OLGA or OLGIM system was associated with an increased risk of gastric cancer. In clinical practice, this translates into the need for frequent and careful monitoring of patients with higher OLGA and OLGIM stages to facilitate early diagnosis and intervention and a better prognosis.