Introduction

Gastric cancer (GC), a common and highly recurrent malignant tumor, is the fifth most prevalent tumor and the third leading cause of tumor-related death worldwide [1]. Multiple markers and methods are used to predict the prognosis and therapeutic outcomes of patients with GC. Among these, nodal involvement has long been considered one of the most important markers of prognosis and metastasis [2, 3]. For this reason, the International Union Against Cancer (UICC) and the American Joint Committee on Cancer (AJCC) have adopted categories based on the number of metastatic lymph nodes (LNs) as the basis for the N stage of the tumor–node–metastasis (TNM) classification. The TNM staging system is the most widely used system for GC staging assessment. For optimal staging of GC, the latest edition recommends the examination of 16 or more LNs to determine nodal metastatic status [4]. Unfortunately, compliance with this recommendation has been poor: > 15 LNs were removed in only 29% of patients, and no nodes were removed at all in 9% [5,6,7]. Multiple studies have demonstrated that the accuracy of the pathological N (pN) classification for prognostic assessment is affected by the number of examined LNs [6, 8]. In particular, if fewer than 16 LNs are examined, N3b patients may be inappropriately classified as N3a because the cutoff for N3a/N3b is 15 metastatic LNs [9]. This phenomenon has been referred to as stage migration. To overcome the potential bias associated with the pN classification, other parameters have been proposed. Some authors have claimed that the ratio-based lymph node system (rN), defined as the ratio of metastatic LNs to examined LNs, can be an alternative for patients with fewer than 15 LNs examined [10,11,12]. Chen et al. showed that the lymph node ratio (LNR) carries more information than the pN classification in patients with an inadequate node count [13].
However, the rN0 classification was congruent with the pN0 classification in prognostic assessments of node-negative patients with GC [12, 14]. For this reason, another solution, the log odds of positive lymph nodes (LODDS), was proposed recently [11]. LODDS is defined as the log of the ratio between the number of positive nodes and the number of negative nodes, and it can further discriminate among patients with N0 GC [14, 15]. LODDS was first proposed in breast cancer, in which it performed equally well as a prognostic indicator in node-positive and node-negative patients [16], and was later generalized to several cancers, including GC [17,18,19,20,21,22]. The LODDS classification was superior to the rN and pN classifications in predicting the prognosis of GC patients with < 15 examined LNs and those with N0 status [11, 23,24,25]. However, the studies supporting this finding have several limitations, including a lack of large multicenter studies and of clearly reported patient inclusion criteria, as well as differing predictive capabilities of LODDS across studies.
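The difference between the ratio-based measure and LODDS in node-negative patients can be illustrated with a minimal sketch. The text defines LODDS only as the log of the ratio of positive to negative nodes; the 0.5 continuity correction (added so the ratio is defined when either count is zero) and the natural logarithm used below are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def lymph_node_ratio(positive: int, examined: int) -> float:
    """Ratio of metastatic LNs to examined LNs (the basis of rN)."""
    if examined <= 0 or not 0 <= positive <= examined:
        raise ValueError("need examined > 0 and 0 <= positive <= examined")
    return positive / examined


def lodds(positive: int, examined: int, correction: float = 0.5) -> float:
    """Log odds of positive LNs.

    ASSUMPTION: a 0.5 correction is added to both counts and the natural
    log is used; the source text does not specify either choice.
    """
    if examined <= 0 or not 0 <= positive <= examined:
        raise ValueError("need examined > 0 and 0 <= positive <= examined")
    negative = examined - positive
    return math.log((positive + correction) / (negative + correction))


# Two hypothetical node-negative patients with different numbers of examined LNs.
for examined in (5, 30):
    print(examined, lymph_node_ratio(0, examined), round(lodds(0, examined), 3))
```

Under these assumptions, both patients have a ratio of 0, whereas the patient with 30 examined nodes receives a lower LODDS value than the patient with 5, which is how LODDS can separate patients within the N0 group.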

We conducted a systematic review and meta-analysis to summarize the predictive and prognostic ability of the LODDS staging system and compare it with the rN and pN classification systems to address these limitations.

Methods

Data sources

We searched PubMed, Medline, Embase, Web of Science and the Cochrane Library for relevant studies from inception to March 7, 2022; the retrieved records formed the evidence base for the meta-analysis. The following keywords were used: “log odds of positive lymph nodes”, “LODDS”, and “gastric cancer”. We used the following strategy: (((((((((gastric neoplasms[MeSH Terms]) OR (stomach neoplasms)) OR (stomach carcinoma)) OR (gastric tumor)) OR (gastric carcinoma)) OR (stomach cancer)) OR (stomach tumor)) OR (gastric neoplasms)) OR (GC)) AND ((log odds of positive lymph nodes) OR (LODDS)). We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement for reporting systematic reviews [26] and registered our review on PROSPERO (https://www.crd.york.ac.uk/prospero/). The PROSPERO registration number is CRD42021274996.

Inclusion and exclusion criteria

We included studies that enrolled ≥ 100 patients with GC (diagnosed with the gold standard test) who were classified by the LODDS, rN, and pN classification systems of the AJCC and followed for at least five years. We included studies that reported at least one of the outcomes of interest or studies in which the outcome could be calculated from the published data. We excluded conference abstracts, reviews, case reports, ongoing trials, open-label trials, comments, letters, meeting records, and studies with a total sample size of fewer than 100 patients, as studies with very small samples are more prone to bias and contribute little information to pooled analyses. We also excluded articles that shared a study population with another article and those that did not provide crucial information needed for detailed stratification.

Study selection

Using a standardized form, two reviewers independently screened the titles and abstracts identified by the search, and the full text was obtained to check eligibility. When a disagreement arose and they were unable to reach a consensus, an adjudicator was consulted to resolve discrepancies.

Data extraction

Using a standardized form, two reviewers worked independently to perform duplicate data abstraction in each eligible study. When a disagreement arose and they were unable to reach a consensus, an adjudicator helped to resolve disagreements. We collected information regarding study characteristics (including year of publication, single-center/multicenter study, clinical study design, country of patients, and patient year), patient characteristics (including patient number, tumor stage, patient age, population, neoadjuvant therapy, number of harvested LNs, and number of metastatic LNs), and all patient-important outcomes (overall survival (OS)).

Outcomes and quality assessment

Prognostic values (OS) were used to compare the different LODDS groups. We compared the predictive and prognostic abilities of the LODDS staging system with those of the rN and pN classification systems.

Using the Newcastle–Ottawa Scale (NOS) [27], two investigators, independently and in duplicate, assessed the quality of the included articles with respect to study group selection, comparability of groups and outcome of interest. The maximum score was 9 points; a score of 1–4 points indicated low quality, and a score of 5–9 points indicated high quality.

Data analysis and statistical methods

The data were analyzed using Stata statistical software version 16.0 (Stata Corporation, College Station, TX) and SPSS version 22 (IBM Corporation).

To statistically assess the prognostic effects of LODDS, we extracted the hazard ratio (HR) and 95% confidence interval (CI) of 5-year OS from the included studies. If these were not directly provided in the original literature, the HR and 95% CI were estimated using the method described by Tierney et al. [28]. An HR greater than 1 suggested a higher risk of disease progression or death in the patients. A random-effects model (REM) was used to pool the data. All pooled estimates are reported with 95% CIs, and P values were two-sided with a significance level of α = 0.05. To assess heterogeneity among studies, we tested OS for heterogeneity across cohorts using Cochran’s Q statistic and used I2 to measure the extent of heterogeneity [29]. For the I2 statistic, heterogeneity was defined as low (25%–50%), moderate (50%–75%) or high (> 75%) [30]. For the Q statistic, P ≤ 0.1 was considered to indicate significant heterogeneity. In addition, subgroup analyses were performed based on differences in baseline characteristics and risk factors in the data retrieved. We also conducted a sensitivity analysis in which each study was removed in turn to evaluate its influence on the overall summary estimates, supplemented by Duval and Tweedie’s trim-and-fill method [31] and Galbraith plots [32]. Publication bias was investigated with qualitative and quantitative methods, including funnel plots and Egger’s test [33]. The groups in each pN and rN stage were regrouped in accordance with LODDS, and the OS differences within groups and between groups were analyzed using chi-squared tests to compare the homogeneity of the three staging methods.
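The pooling and heterogeneity steps described above can be sketched as follows. This is an illustration only: the DerSimonian-Laird estimator of the between-study variance is an assumption (the text states only that a random-effects model was fitted in Stata), the function names are hypothetical, and the input values are invented rather than taken from the included studies. Standard errors are recovered from reported 95% CIs on the log scale.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% CI


def log_hr_and_se(hr: float, lower: float, upper: float) -> tuple[float, float]:
    """Recover log(HR) and its standard error from a reported 95% CI."""
    return math.log(hr), (math.log(upper) - math.log(lower)) / (2 * Z_975)


def pool_random_effects(log_hrs: list[float], ses: list[float]) -> dict:
    """Pool log hazard ratios (ASSUMED estimator: DerSimonian-Laird)."""
    weights = [1 / se ** 2 for se in ses]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_hrs)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_hrs))
    df = len(log_hrs) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    re_weights = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(w * y for w, y in zip(re_weights, log_hrs)) / sum(re_weights)
    se_pooled = math.sqrt(1 / sum(re_weights))
    return {
        "hr": math.exp(pooled),
        "lower": math.exp(pooled - Z_975 * se_pooled),
        "upper": math.exp(pooled + Z_975 * se_pooled),
        "q": q, "df": df, "i2": i2, "tau2": tau2,
    }


# Hypothetical (HR, lower, upper) triples, NOT data from the included studies.
reported = [(1.5, 1.2, 1.9), (2.1, 1.6, 2.8), (1.3, 0.9, 1.9)]
effects, errors = zip(*(log_hr_and_se(*r) for r in reported))
print(pool_random_effects(list(effects), list(errors)))
```

The P value for Q would come from a chi-squared distribution with `df` degrees of freedom; it is omitted here to keep the sketch free of third-party dependencies.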

Results

Study characteristics

Based on the search strategy mentioned above, the original search yielded 209 records. Finally, we included 12 unique studies [11, 12, 14, 15, 23,24,25, 34,35,36,37,38] published between 2010 and 2021 with 20,312 patients according to the eligibility criteria. The flowchart of the search and selection process is shown as a PRISMA flowchart in Fig. 1.

Fig. 1 Flow diagram of study selection

Tables 1 and 2 summarize the main characteristics of the studies included in the meta-analysis. Because the included studies used different LODDS strata, we re-stratified the groups in the meta-analysis. Eleven studies were retrospective, and one study used a prospectively maintained database. Eight studies were from China [11, 12, 14, 23, 24, 35, 37, 38], and one study each was from Hungary [25], Korea [34], the United States [15], and Italy [36]. Five studies [11, 24, 25, 35, 36] included only patients who had not received neoadjuvant therapy, three studies [15, 23, 37] did not restrict enrollment according to whether patients had received neoadjuvant therapy, and the remaining studies [12, 14, 34, 38] did not report the relevant information. Four studies [11, 12, 14, 38] compared the predictive and prognostic abilities of the LODDS staging system with those of the rN or pN classification systems.

Table 1 Characteristics of included studies for the meta-analyses
Table 2 Clinicopathologic characteristics of included studies for the meta-analyses

Study analysis

LODDS and OS in GC

We analyzed OS in different LODDS categories according to the data from the included articles. The results of the pooled analysis are summarized in Table 3.

Table 3 Results of subgroup analyses on prognostic effects of GC patients

When pooling the HR for OS, LODDS1, LODDS2, LODDS3, and LODDS4 in GC patients were correlated with poor OS compared with LODDS0 (Fig. 2). Patients with LODDS1 had inferior OS compared with those with LODDS0 (HR = 1.62, 95% CI (1.42, 1.85)), and the heterogeneity was significant (I2 statistic = 63.5%, P heterogeneity = 0.003). The pooled results indicated that LODDS2 GC patients had a worse OS (HR = 2.47, 95% CI (2.02, 3.03)) than LODDS0 GC patients. There was significant heterogeneity (I2 statistic = 86.2%, P heterogeneity < 0.001). Compared with LODDS0 GC patients, LODDS3 GC patients had a worse OS (HR = 3.15, 95% CI (2.50, 3.97)), and the heterogeneity was significant (I2 statistic = 92.1%, P heterogeneity < 0.001). The results of the pooled analysis using the REM showed that LODDS4 GC patients were also associated with poorer OS (HR = 4.55, 95% CI (3.29, 6.29)) than LODDS0 GC patients, and between-study heterogeneity was significant (I2 statistic = 96.6%, P heterogeneity < 0.001). Overall, as the LODDS grade increases, the prognosis of patients with GC becomes increasingly worse.

Fig. 2 Estimated HR summary for OS. a LODDS1 vs. LODDS0, b LODDS2 vs. LODDS0, c LODDS3 vs. LODDS0, d LODDS4 vs. LODDS0. HR > 1 indicates more disease progression or deaths in the patients. Data were pooled using a random-effects model (REM). All statistical values were combined with 95% CIs and two-sided P-values, the threshold of which was set to 0.05. Abbreviations: HR, hazard ratio; OS, overall survival; LODDS, log odds of positive lymph nodes

Subgroup analysis

To analyze the potential sources of between-study heterogeneity, we performed subgroup analyses according to publication year, country, patient number and whether patients received neoadjuvant therapy. The results are summarized in Table 3. Consistent with the above results, LODDS1, LODDS2, LODDS3, and LODDS4 GC patients had a worse OS than LODDS0 GC patients in all subsets. The subgroup analysis indicated that the heterogeneity arose mainly from whether patients received neoadjuvant therapy. In patients without neoadjuvant therapy, the pooled analysis using the REM showed that LODDS1, LODDS2, LODDS3, and LODDS4 GC patients were also associated with poorer OS than LODDS0 GC patients (LODDS1 vs. LODDS0: HR = 1.85, 95% CI (1.64, 2.07); LODDS2 vs. LODDS0: HR = 3.04, 95% CI (2.58, 3.58); LODDS3 vs. LODDS0: HR = 4.03, 95% CI (3.66, 4.43); LODDS4 vs. LODDS0: HR = 6.36, 95% CI (4.50, 8.99)). The heterogeneity of OS decreased to a nonsignificant level in every comparison except LODDS4 vs. LODDS0 (LODDS1 vs. LODDS0: I2 statistic = 0.0%, P heterogeneity = 0.381; LODDS2 vs. LODDS0: I2 statistic = 29.2%, P heterogeneity = 0.227; LODDS3 vs. LODDS0: I2 statistic = 0.0%, P heterogeneity = 0.975; LODDS4 vs. LODDS0: I2 statistic = 81.3%, P heterogeneity < 0.001).

To explore the potential sources of heterogeneity in the LODDS4 vs. LODDS0 group, we used Galbraith plots and Duval and Tweedie’s trim-and-fill method. The results showed that the training set of the study by Jian-Hui C et al. [35] might have been the main contributor to the heterogeneity in OS (Fig. 3a). After omitting this study, the pooled HR was not obviously affected (HR = 5.13, 95% CI (4.46, 5.68); Fig. 3b), but the heterogeneity for OS dropped to a nonsignificant level (from I2 statistic = 81.3%, P heterogeneity < 0.001 to I2 statistic = 2.2%, P heterogeneity = 0.381; Fig. 3c).
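The idea of omitting one study at a time and re-checking heterogeneity can be sketched as below. The DerSimonian-Laird estimator, the helper names, and the input values are assumptions made for illustration; the numbers are invented and do not reproduce the analysis reported here.

```python
import math


def pool_dl(log_hrs: list[float], ses: list[float]) -> dict:
    """Random-effects pooled HR and I2 (ASSUMED estimator: DerSimonian-Laird)."""
    w = [1 / se ** 2 for se in ses]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_hrs)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hrs))
    df = len(log_hrs) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_hrs)) / sum(w_re)
    return {"hr": math.exp(pooled), "i2": i2}


def leave_one_out(log_hrs: list[float], ses: list[float]) -> list[dict]:
    """Re-pool the data with each study omitted in turn."""
    results = []
    for omit in range(len(log_hrs)):
        kept_y = [y for i, y in enumerate(log_hrs) if i != omit]
        kept_se = [s for i, s in enumerate(ses) if i != omit]
        results.append({"omitted": omit, **pool_dl(kept_y, kept_se)})
    return results


# Hypothetical log HRs: three similar studies and one outlier (index 3).
log_hrs = [math.log(5.0), math.log(5.2), math.log(4.9), math.log(9.0)]
ses = [0.1, 0.1, 0.1, 0.1]
for row in leave_one_out(log_hrs, ses):
    print(row["omitted"], round(row["hr"], 2), round(row["i2"], 1))
```

In this invented example, I2 falls to zero only when the outlying study is omitted, which is the pattern a leave-one-out analysis is designed to reveal.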

Fig. 3 Process of exploring the potential sources of heterogeneity on OS. a Galbraith plot for OS. b Forest plot for OS after Jian-Hui C et al. (2016) is omitted. c Change of heterogeneity before and after Jian-Hui C et al. (2016) is omitted. Weights are from random-effects analysis. P value for heterogeneity. Abbreviations: HR, hazard ratio; OS, overall survival; SE, standard error

Publication bias

Publication bias was assessed by funnel plots and Egger’s test. Formal evaluation using Egger’s test failed to identify significant publication bias in the analyses of LODDS1 vs. LODDS0 (p = 0.608), LODDS2 vs. LODDS0 (p = 0.799), LODDS3 vs. LODDS0 (p = 0.943), and LODDS4 vs. LODDS0 (p = 0.216) for OS. The results with P values for Egger’s test are listed in Table 3. In addition, we used funnel plots to detect publication bias, as shown in Figure S1. All of the funnel plots of the included articles showed a symmetrical distribution. Thus, no significant publication bias was found in the meta-analyses of OS.
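Egger’s test regresses each study’s standardized effect (log HR divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero suggests funnel-plot asymmetry. The sketch below is a minimal ordinary least-squares version with hypothetical function names and invented inputs. It returns the intercept and its t statistic; the P value, which would come from a t distribution with k - 2 degrees of freedom, is left out to avoid third-party dependencies.

```python
import math


def egger_regression(log_effects: list[float], ses: list[float]) -> dict:
    """Regress standardized effects on precision and report the intercept."""
    k = len(log_effects)
    if k < 3 or k != len(ses):
        raise ValueError("need at least three studies with matching standard errors")
    precision = [1.0 / se for se in ses]
    standardized = [y / se for y, se in zip(log_effects, ses)]
    x_bar = sum(precision) / k
    z_bar = sum(standardized) / k
    sxx = sum((x - x_bar) ** 2 for x in precision)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxz = sum((x - x_bar) * (z - z_bar) for x, z in zip(precision, standardized))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    # Residual variance with k - 2 degrees of freedom.
    rss = sum((z - intercept - slope * x) ** 2
              for x, z in zip(precision, standardized))
    s2 = rss / (k - 2)
    se_intercept = math.sqrt(s2 * (1 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "slope": slope,
            "se_intercept": se_intercept, "t": t_stat, "df": k - 2}


# Hypothetical inputs in which smaller studies (larger SE) report larger effects.
print(egger_regression([0.5, 0.6, 0.9, 1.2], [0.1, 0.2, 0.3, 0.4]))
```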

Comparison of the homogeneity of prognostic assessments

Four studies [11, 12, 14, 38] compared the predictive and prognostic abilities of the LODDS staging system with those of the rN or pN classification systems. OS rates were compared among different pN and rN classifications when stratified by the LODDS classification and among different LODDS classifications when stratified by the pN or rN classification. The data from all these studies were pooled, as shown in Table 4. For patients in each of the pN classifications and in each of the rN classifications, significant differences in survival were consistently observed among patients with different LODDS classifications. Meanwhile, for patients in each LODDS classification, prognosis was highly homogeneous across different pN or rN classifications. These results indicated that the LODDS classification might be superior to the pN and rN classifications for prognostic assessment.

Table 4 Overall survival rates according to different pN and rN classifications stratified by the LODDS staging system

Discussion

Despite recent advances in the treatment of patients with GC, OS remains far from satisfactory. The accuracy of staging systems for patients with GC is important for predicting long-term survival, guiding treatment, and identifying patients for clinical trials. Because of the importance of nodal involvement in assessing prognosis and defining the management of patients with GC [2, 3], there has been intense interest in defining an optimal LN staging system. Numerous parameters have been proposed to stratify the long-term prognosis of patients with GC according to the status of LN metastasis [39, 40]. The UICC/AJCC staging systems for GC stratify nodal staging, namely the pN stage, according to the number of positive LNs. The major flaw of the pN classification is that it considers only the number of metastatic LNs and ignores the influence of the number of examined LNs, which may lead to stage migration [13, 41,42,43,44]. To overcome the potential bias associated with the pN classification, other parameters have been proposed that take into account both the number of examined LNs and the number of metastatic LNs. The rN is a simple method for nodal staging that is less influenced by the number of examined LNs than the pN classification. Some authors have claimed that the rN classification can be an alternative to the pN classification for patients with fewer than 15 LNs examined [45,46,47], and that the rN classification can minimize the phenomenon of stage migration [48, 49]. However, the rN0 classification was congruent with the pN0 classification in prognostic assessments of node-negative patients with GC [12, 14], and a minimum number of LNs still needs to be examined to ensure accurate prognostic assessment.
LODDS, a novel prognostic LN-related index that considers the effects of the numbers of both positive LNs and negative LNs, was developed to improve the accuracy of prognostic assessment, particularly in patients with N0 status or with < 15 total harvested LNs [50]. The LODDS classification is also considered superior to the rN and pN classifications because it can be used to study LN involvement in patients at all classification levels [11, 51,52,53].

To our knowledge, this is the first meta-analysis focused on the role of LODDS in predicting the prognosis of patients with GC. By pooling the available evidence, it is more informative than any single previous study. Our meta-analysis of twelve articles including 20,312 GC patients indicated that LODDS1, LODDS2, LODDS3, and LODDS4 patients have poorer OS than LODDS0 patients, which shows that LODDS has prognostic value in patients with GC. When the HRs for OS were pooled, LODDS1, LODDS2, LODDS3, and LODDS4 in GC patients were each correlated with poorer OS compared with LODDS0 (LODDS1 vs. LODDS0: HR = 1.62, 95% CI (1.42, 1.85), I2 statistic = 63.5%, P heterogeneity = 0.003; LODDS2 vs. LODDS0: HR = 2.47, 95% CI (2.02, 3.03), I2 statistic = 86.2%, P heterogeneity < 0.001; LODDS3 vs. LODDS0: HR = 3.15, 95% CI (2.50, 3.97), I2 statistic = 92.1%, P heterogeneity < 0.001; LODDS4 vs. LODDS0: HR = 4.55, 95% CI (3.29, 6.29), I2 statistic = 96.6%, P heterogeneity < 0.001). As the LODDS grade increases, the prognosis of patients with GC becomes correspondingly worse. However, we found substantial heterogeneity in all groups. To analyze the potential sources of between-study heterogeneity, we performed subgroup analyses according to publication year, country, patient number and whether patients received neoadjuvant therapy. The subgroup analysis indicated that the heterogeneity arose mainly from whether patients received neoadjuvant therapy. Additionally, we found that patients without neoadjuvant therapy had poorer OS than patients with neoadjuvant therapy in the same group. Therefore, the results suggest that whether patients receive neoadjuvant therapy is one of the most important factors affecting the prognostic effectiveness of LODDS in patients with GC. Although we performed a subgroup analysis according to whether patients received neoadjuvant therapy, significant heterogeneity was still found in the pooled analysis of OS in the LODDS4 vs. LODDS0 group. To explore the potential sources of this heterogeneity, we used Galbraith plots and Duval and Tweedie’s trim-and-fill method, and the results showed that the training set of the study by Jian-Hui C et al. [35] might have been the main contributor to the heterogeneity in OS.

Additionally, we assessed differences in survival among patients in different LODDS classifications for each of the pN or rN classifications. We summarized the results from four studies [11, 12, 14, 38] that compared the predictive and prognostic abilities of the LODDS staging system with those of the rN or pN classification systems. For patients in each of the rN and pN classifications, significant differences in survival were consistently observed among patients in different LODDS classifications. Meanwhile, for patients in each LODDS classification, prognosis was highly similar between patients with different pN or rN classifications (see Table 4). Thus, we considered that the superiority of the LODDS classification to the rN and pN classifications was mainly because of its potential to discriminate patients with the same ratio of node metastasis but different survival.

However, several limitations of the current meta-analysis should be emphasized. First, as several studies did not report HRs, HRs in this meta-analysis were estimated based on the method described by Tierney et al. [28]. Second, due to the lack of relevant information on the anatomic nodal group locations, we did not analyze the effect of the anatomic nodal group locations on the prognostic effectiveness of LODDS in patients with GC. Third, the optimal cutoff value of LODDS has not been established. The lack of consensus on this cutoff is a critical obstacle to using LODDS in the clinic and in experiments, and LODDS is therefore not yet used clinically on a large scale. Fourth, we were unable to apply AUC, C-index, or AIC values to determine which of the LODDS, rN or pN classifications is superior because we did not have access to detailed patient information. Despite these limitations, this is, to our knowledge, the first meta-analysis focusing on the role of LODDS in predicting the prognosis of patients with GC. The results suggest that LODDS can predict the survival of GC patients. Moreover, it may be a novel prognostic predictor and a more accurate and sensitive stratification tool for use in clinical studies.

Conclusion

Our systematic review and meta-analysis demonstrated that LODDS is correlated with the prognosis of GC patients and suggests that it predicts their survival more accurately than previous methods. As the LODDS grade increases, the prognosis of patients with GC becomes correspondingly worse. Additionally, we found that patients without neoadjuvant therapy had poorer OS than patients with neoadjuvant therapy in the same LODDS group. Therefore, the results suggest that whether patients receive neoadjuvant therapy is one of the strongest factors affecting the prognostic effectiveness of LODDS. Moreover, the LODDS classification appears to be superior to the pN and rN classifications for prognostic assessment. Incorporating LODDS into the staging system of GC may enable clinicians to predict the prognosis of patients more accurately. Further high-quality, large-scale, international, well-designed multicenter prospective studies are needed to determine the optimal cutoff point of LODDS and to find a simple and repeatable way to calculate LODDS values to facilitate its use in the clinic.