Introduction

Despite the overall decline in incidence, stomach cancer remains one of the leading causes of cancer-related deaths worldwide, accounting for nearly 1 million new cases each year [1]. Relapse is common after radical gastrectomy, occurring in more than half of treated cases [2]. The median survival time is usually less than 12 months after radical gastrectomy [2], emphasizing not only the urgent need to improve the current treatment and staging system of gastric cancer (GC), but also the need for an efficient screening program, since early GC has a significantly better prognosis than advanced GC. Surgery remains the only potentially curative treatment for patients with GC.

Moreover, correct staging of the disease is vital, as the stage of the disease establishes the prognosis and the further management of the patient [3]. The current staging systems used are the 14th JCGC [4] and 7th TNM [5] guidelines, both requiring retrieval of at least 16 lymph nodes in order to accurately stage the disease, although some studies point out that this cut-off is not optimal [6,7,8].

Nodal status is the most powerful prognostic factor in GC [9]. Hence, it is essential to precisely characterize the nodal status in order to appropriately stage the disease and treat the patient properly. Other systems, like lymph node ratio (LNR) and log odds of positive lymph nodes (LODDS), have been developed in order to provide a more accurate lymph node status [10,11,12,13]. Which of the aforementioned systems has the most accurate prognostic value is highly debatable, and further studies are needed to reach a consensus. In any case, none of the systems takes into account the factors that influence the total lymph node yield. These factors are patient related, surgery related, pathology related, and treatment related, and can influence the total number of retrieved lymph nodes [14,15,16,17]. Consequently, optimizing these factors may increase the total lymph node yield and lead to a more accurate staging of the disease, improving the treatment and the survival of the patient.
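As a minimal sketch of how the two alternative measures are commonly computed: LNR is the number of positive nodes divided by the number of examined nodes, and LODDS is the natural logarithm of the odds of a node being positive. The 0.5 continuity correction, the function names, and the example counts are assumptions made for illustration; the cited studies [10,11,12,13] may use variants.

```python
import math


def lymph_node_ratio(positive: int, total: int) -> float:
    """LNR: positive lymph nodes divided by total examined lymph nodes."""
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("need total > 0 and 0 <= positive <= total")
    return positive / total


def lodds(positive: int, total: int, correction: float = 0.5) -> float:
    """LODDS: log odds of positive nodes.

    The 0.5 correction (an assumed, commonly used value) avoids
    division by zero and log(0) when either count is zero.
    """
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("need total > 0 and 0 <= positive <= total")
    negative = total - positive
    return math.log((positive + correction) / (negative + correction))


# Hypothetical patients: same positive count (same pN), different yields.
for total in (16, 35):
    print(total, lymph_node_ratio(3, total), lodds(3, total))
```

Both measures fall as the number of examined nodes rises while the positive count stays fixed, which is why they depend on the total lymph node yield.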

The purpose of this study is to determine whether surgical ex vivo lymphodissection (EVLD) influences the total number of retrieved lymph nodes and to determine if this supplementary procedure can impact the staging system and the postsurgical treatment. In order to achieve this, we conducted a retrospective–prospective study, comparing surgical EVLD vs. non-ex vivo lymphodissection (non-EVLD), and we reviewed the current literature regarding this topic. Finally, we performed a systematic literature review and meta-analysis in which we included our own study and four other studies that matched the inclusion criteria.

Materials and methods

Clinical study

The current retrospective–prospective study enrolled 75 white Caucasian patients with gastric adenocarcinoma who underwent radical gastrectomy with at least D2 lymphodissection between April 2008 and April 2017. For this study, the participants gave written informed consent according to protocols approved by the Fundeni Clinical Institute Ethics Committee. All gastrectomies were performed by a single experienced surgeon at the Department of General Surgery, Fundeni Clinical Institute. Starting from January 2016, the surgeon added surgical EVLD to the management of patients with the aforementioned features. The group was subdivided according to the presence or absence of EVLD. Of the 75 patients, 12 underwent EVLD while 63 did not, the latter forming the non-EVLD group.

By non-EVLD we mean that the surgically removed specimens were submitted directly to the Pathology Department without being previously processed by the surgeon. In the EVLD group, the removed specimens were rigorously dissected in the operating room by the instructing surgeon immediately after the surgery was finished. During the dissection, all isolated lymph nodes were sorted according to their respective lymph node groups and submitted to the Pathology Department for histopathological analysis.

Exclusion criteria were the following: diagnosis other than adenocarcinoma and previous subtotal gastrectomy for another condition.

We included the following clinical and pathological variables in this study: age and gender of the patients, total lymph node count, positive lymph node count, tumor grade, stage of the disease according to the JGCA v.3 (Japanese Gastric Cancer Association, version 3), extent of gastrectomy, type of lymphodissection, and the presence of splenectomy.

We used IBM SPSS 22 software (IBM, Armonk, NY, USA) to analyze our data. Differences between the two subgroups were tested by means of Fisher’s exact test and Mann–Whitney nonparametric test.

Systematic literature review and meta-analysis

Search strategy

Two authors (MDB and MD) independently searched articles indexed in PubMed (updated by May 20, 2017) using the following MeSH terms and keywords simultaneously: “Lymph Node Excision/methods” [MeSH] OR “Lymph Node Excision/mortality” [MeSH] AND “Stomach Neoplasms/pathology” [MeSH] OR “Stomach Neoplasms/surgery” [MeSH] OR “ex vivo lymphodissection” OR “ex vivo lymphadenectomy” OR “lymph node number” OR “lymph node yield” OR “lymph node retrieval” AND “gastric cancer.”

Inclusion and exclusion criteria

Only original research articles were included. No language restrictions were imposed. Studies were considered eligible if both of the following criteria were met: (1) they investigated gastric adenocarcinoma, and (2) they compared the number of lymph nodes harvested by the pathologist alone with the number of lymph nodes harvested by the surgeon and the pathologist (Fig. 1).

Fig. 1 Flowchart of steps in the systematic review

Quality assessment

We used the Newcastle–Ottawa Scale (NOS) to assess the quality of the included studies. A study that scored 6 or more stars was considered to be of good quality [18].

Data extraction

Data extraction was also performed simultaneously by two authors (MDB and MD), and the obtained data were subsequently compared between the two, in order to exclude any error. The extracted data for this study included: first author, publication year, country, EVLD and non-EVLD sample size, mean number of retrieved lymph nodes for the two groups, and the specific standard deviation of each group.

Meta-analysis

We used Review Manager (RevMan) software, version 5.3 (The Cochrane Collaboration, Software Update, Oxford) to analyze the data. Statistical heterogeneity between studies was measured by Cochran’s Q test and the I² statistic, with an I² value of >50% taken to indicate heterogeneity. Mean difference (MD) with a 95% confidence interval (95% CI) was calculated for continuous variables. A random effect model was used when I² was >50% and a fixed effect model when it was <50% [19, 20]. A p-value of ≤0.05 for the test for overall effect was considered statistically significant.
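The pooling procedure described above can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the RevMan computation: it assumes inverse-variance weighting and the DerSimonian–Laird estimator for the between-study variance (the text states only that a random effect model was used), and the input values are hypothetical.

```python
import math


def pool_mean_differences(studies, z=1.96):
    """Pool mean differences with inverse-variance weights.

    studies: list of (mean_a, sd_a, n_a, mean_b, sd_b, n_b) tuples.
    Follows the rule in the text: random effect model if I2 > 50%,
    fixed effect model otherwise.
    """
    diffs = [m1 - m2 for m1, _, _, m2, _, _ in studies]
    variances = [s1**2 / n1 + s2**2 / n2 for _, s1, n1, _, s2, n2 in studies]

    # Fixed effect estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * d for wi, d in zip(w, diffs)) / sum(w)

    # Cochran's Q and the I2 statistic (in percent).
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, diffs))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance (assumed estimator).
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    if i2 > 50.0:
        model = "random"
        w = [1.0 / (v + tau2) for v in variances]
    else:
        model = "fixed"

    pooled = sum(wi * d for wi, d in zip(w, diffs)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    return {
        "model": model,
        "pooled_md": pooled,
        "ci": (pooled - z * se, pooled + z * se),
        "q": q,
        "df": df,
        "i2": i2,
    }


# Hypothetical input: (mean, SD, n) for each of two groups, per study.
example = [(40, 5, 100, 20, 5, 100), (25, 5, 100, 20, 5, 100)]
print(pool_mean_differences(example))
```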

Performing a funnel plot in order to evaluate the publication bias is justified only if a minimum of 10 studies are included in the meta-analysis [21].

Results

Clinical study

The difference in total lymph node yield between the two groups was statistically significant (p < 0.001). The mean total lymph node count was 23.21 for the non-EVLD group, whereas for the EVLD group it was 34.75. This result can be only partially explained by the higher percentage of more than D2 lymphodissections in the EVLD group (41.7%) than in the non-EVLD group (14.3%), the difference being statistically significant (p = 0.045). Despite the increase in total lymph node yield in the EVLD group, the percentage of positive lymph nodes in this group did not change after this procedure.
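For readers who wish to check a 2×2 comparison such as the proportion of more than D2 lymphodissections, a two-sided Fisher’s exact test can be sketched with the standard library alone. The counts below (5 of 12 and 9 of 63) are back-calculated from the reported percentages and are therefore an assumption; the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention, so the result need not reproduce the reported p-value exactly.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


# Assumed counts: rows = EVLD / non-EVLD, columns = more than D2 / D2.
p_value = fisher_exact_two_sided(5, 7, 9, 54)
print(f"two-sided p = {p_value:.3f}")
```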

No statistically significant differences were noted regarding the mean positive lymph node yield (p = 0.758 for Mann–Whitney test) between the two groups. The non-EVLD group had a mean of 3.9 positive lymph nodes, while the EVLD group had a mean of 3.42.

No statistically significant differences regarding age, gender, pT, pN, stage of the disease, tumor grade, presence of splenectomy, and extent of surgery between the two groups were noted according to Fisher’s exact test (Table 1).

Table 1 Patient- and tumor-related variables of our study

Systematic review and meta-analysis

After searching the mentioned terms in PubMed, we identified 905 potentially relevant articles, out of which 890 were excluded by title and abstract. The remaining 15 articles were fully assessed and only 4 articles were eligible for our meta-analysis. No other studies were identified through manual search of the reference lists. The flowchart of the systematic literature search is represented in Fig. 1. All the studies achieved a score of 7 or above on the NOS, the mean score was 7.6 stars. The results are shown in Table 2.

Table 2 Studies included in the meta-analysis

Two studies were conducted in the USA [22, 25], one in the Netherlands [23], and one in China [24]. To the aforementioned articles, we added our personal clinical experience regarding this topic and conducted a meta-analysis including five studies, with a total of 623 patients in the EVLD group and 811 in the non-EVLD group. Significant inter-study heterogeneity (I² = 58%, χ² = 9.63, df = 4, p = 0.05) was noted and we used a random effect model.
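The reported heterogeneity figures are mutually consistent, which can be verified from the chi-square value and its degrees of freedom. The sketch below assumes the usual definition I² = (Q − df)/Q and uses the closed-form chi-square tail probability that holds for df = 4 only; the function names are illustrative.

```python
import math


def i_squared(q: float, df: int) -> float:
    """I2 in percent from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def chi2_sf_df4(x: float) -> float:
    """Upper-tail probability of a chi-square variable with df = 4.

    Closed form valid for df = 4 only: exp(-x/2) * (1 + x/2).
    """
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


q, df = 9.63, 4  # values reported in the text
print(round(i_squared(q, df), 1))  # about 58.5
print(round(chi2_sf_df4(q), 3))    # about 0.047
```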

All studies report a significant mean difference (MD) in lymph node count between the two groups in favor of the EVLD group. The cumulative MD between the EVLD group and the non-EVLD group is 11.52 (p < 0.00001; Fig. 2).

Fig. 2 Results of the meta-analysis (forest plot). EVLD group: ex vivo lymphodissection group; Non-EVLD group: non-ex vivo lymphodissection group; SD: standard deviation; IV: inverse variance; random: random effect model; 95% CI: 95% confidence interval

Because only five studies were included in the meta-analysis, using a funnel plot was not justified.

Discussion

The limitations of our study are obvious, namely the small number of patients included, especially in the EVLD group. Hence, we decided to perform a meta-analysis which included all the other studies that compared EVLD with non-EVLD, adding our study as well. Moreover, through the meta-analysis, we wanted to point out the degree of statistical heterogeneity among the included studies.

This study focuses on a single surgeon’s experience with gastric adenocarcinoma and at least D2 lymphodissection, which is an advantage because the total lymph node count is known to also be influenced by the surgeon’s experience and diligence [14, 23, 26]. Consequently, the variability in total lymph node yield caused by these differences in experience and diligence has been removed. Besides our study, only one other was centered on a single surgeon’s experience: it included 8 patients with EVLD and 27 patients with non-EVLD treated by one Japanese instructing surgeon and confirmed our finding. In this case, the instructing surgeon retrieved a mean lymph node number of 60, while the pathologist retrieved 31 (p = 0.003; [23]).

With one exception [24], none of the studies included in the meta-analysis [22,23,24,25] reported an increase in positive lymph node count and overall survival, despite an increase in total lymph node count. The small number of patients included in these studies and the short follow-up time may account for the absence of significant differences regarding the staging and overall survival [22, 23, 25].

In contrast to these results, Jiang et al. reported a significant difference in stage-to-stage overall survival between the EVLD and non-EVLD groups [24]. This study has the major advantage of including significantly more patients, 475 in the EVLD group and 581 in the non-EVLD group, making it more reliable. One of the reasons accounting for these results is probably a decrease in stage migration. The concept behind the stage migration effect is that patients with a low number of retrieved and analyzed lymph nodes are frequently understaged and, for this reason, do not benefit from proper therapeutic management [15, 27, 28]. In this study, surgical EVLD not only increased the total lymph node count (29 vs. 20, p < 0.001), but also increased the positive lymph node count (6.53 vs. 4.09, p = 0.021). Consequently, applying a more thorough lymphodissection performed by the surgeon has led to an increase in positive lymph node count, a more suitable staging of the disease, and, most importantly, a significant benefit in overall survival for stages II and III [24].

It is known that in many Western surgery centers the minimal requirements for an adequate staging (lymph node retrieval ≥16) are not fulfilled [29,30,31]. Especially in these centers, we think that EVLD should be used in order to increase the lymph node yield. Taking all five studies into account, it is important to emphasize that, in contrast to the non-EVLD groups, almost all patients in the EVLD groups were adequately staged, having a proper number of retrieved and examined lymph nodes, at least according to the GC guidelines. In our study, 5 patients from the non-EVLD group were not adequately staged, whereas all patients in the EVLD group had a proper number of retrieved lymph nodes.

We argue that EVLD should also be performed in specialized centers where an adequate lymph node count is the rule, because studies have pointed out that harvesting only 16 lymph nodes (LN) is not enough for correct staging [6,7,8]. Recently, Woo et al. conducted a study which included 25,289 patients and were able to establish the cut-off value, namely 29 LN, over which further LN retrieval would not add any benefit: stage-by-stage patients with ≥29 LN retrieved had a better overall survival than patients with <29 retrieved LN (p < 0.001) [6]. Chen et al. underline that patients with N2/N3 GC and ≥25 LN harvested have a better 5‑year overall survival than N2/N3 patients with ≤25 LN retrieved, regardless of the lymphodissection type. Moreover, the study concludes that harvesting ≥25 LN is required to correctly stage a patient with N3b GC [7]. Another study highlights similar findings, except that the proposed cut-off value is >22 LN and the 5‑year overall survival benefit was shown only for N3 patients [8].

These higher cut-off values may actually reflect a true D2 lymphodissection, which is the standard procedure for the treatment of GC [32]. Performing a D2 lymphodissection is essential, since, after an intense debate, it has been associated with a better overall survival than D1 lymphodissection [33,34,35]. Moreover, two studies established that an average of 27 [32] or 28 [36] lymph nodes should be identified and analyzed after D2 lymphodissection. This finding is highly significant, since examining fewer than 25 lymph nodes may actually reflect poor surgical technique, poor pathological assessment, or both. Conversely, assessing ≥25 LN reflects a proper D2 lymphodissection and pathological analysis [32].

Some causes have been proposed to explain the higher lymph node yield in the EVLD group: (1) higher surgical diligence regarding lymph node retrieval; (2) palpation differences between adipose tissue and lymph nodes are more obvious right after the surgery [17, 23]. Regardless of the cause, it is important to emphasize that a surgeon-related factor, i.e., EVLD, can change the outcome of a patient. The main reason for which this occurs is that the postoperative retrieval technique of lymph nodes has not been standardized. As a consequence, factors which influence the total lymph node count, including EVLD, should be considered in the staging of GC. The influence of these factors explains the heterogeneity between the studies included in the meta-analysis. The I² of 58% indicates that the differences between studies in the mean difference of total LN are not due to chance alone.

As we already mentioned, the total lymph node count is known to impact on the patient’s survival. One of the most widely used methods to increase the lymph node count implies the use of fat clearing techniques [16]. These have been shown by multiple studies to lead to an increase in total lymph node count compared to the conventional lymph node sampling. Indeed, a systematic literature review and meta-analysis concerning the techniques to increase the lymph node harvest from gastrointestinal cancer specimens pointed out that the use of fat clearing staining increased the mean lymph node yield in all the 18 studies which approached this matter [16]. Most of the studies included in this article focused on colorectal cancers. Nevertheless, these results prove to also be valid for gastric adenocarcinoma [27, 37,38,39]. It is also relevant that by use of fat clearing techniques, more small lymph nodes are identified, which otherwise may not have been detected [16, 37, 39]. Noda et al. showed that ignoring all the lymph nodes with a diameter of 5 mm or less will decrease the metastatic lymph node count by 37.8% [40]. All these results converge towards one conclusion: fat clearing techniques should be used in order to increase the lymph node count, especially by pathologists whose lymph node count is low. Despite the increase in total lymph node yield, it is still uncertain whether the use of fat clearing in gastrointestinal specimens is associated with an increase in positive lymph nodes and a change in the disease staging [16]. Further studies are required.

Which methods and techniques are the most appropriate for gastric adenocarcinoma processing is debatable. Only an optimal, efficient, clear, and reliable protocol for tissue processing can improve the lymph node count and reduce its high variability among pathology departments.

“The surgery of the malignant disease is not the surgery of the organs; it is the anatomy of the lymphatic system” [41] is a more than 100-year-old statement of the British surgeon B.G.A. Moynihan, which emphasizes the importance of the lymphatic drainage in malignant tumors, including GC. In order to accurately stage GC, it is vital to determine the lymph node status. This problem is far from being solved because of the complex gastric lymphatic drainage, but also because of the highly debatable staging system, which has many flaws, including: (1) lack of standardization regarding the lymph node count; (2) disregard for the negative lymph node count; (3) disregard for factors which influence the total LN yield; and (4) disregard for the concept of micrometastasis. In other words, patients with the same positive LN count will have the same pN, but just because two patients have the same pN, does it mean that they actually share the same LN status? If these patients present the same factors which influence the LN count, including EVLD, then yes, they will have the same LN status, but this is never the case.

Variability among factors which influence LN counts should therefore be considered in order to accurately characterize the lymph node status of patients treated for GC. These are patient-related, treatment-related, surgeon-related, and pathology-related factors [14,15,16,17]. It is also important to recall the complex lymphatic anatomy of the stomach, as well as the altered and unpredictable lymph drainage which occurs in GC [9, 42,43,44]. For instance, a case report presents a patient who was re-staged from stage I to stage IV GC after intraoperative detection of sentinel lymph nodes revealed a single skip metastasis along the middle colic artery, which accounts for M1 disease [45]. The success of GC staging and treatment is dependent on an accurate depiction of the lymph node status. The present staging system, alone, is far from accomplishing this goal. Other more dynamic and personalized models designed to evaluate the lymph node status have been developed, e.g., the Maruyama Index [46, 47], Artificial Neural Network Algorithms [48], intraoperative lymphatic mapping, and sentinel lymph node biopsy using radioactive tracer [49]. Furthermore, molecular markers with prognostic value offer an insight into the complex and heterogeneous behavior of GC [50,51,52] and may improve the aforementioned models, enabling a more personalized staging and treatment. It is not by chance that the American Joint Committee on Cancer (AJCC) has recognized the need for a more personalized approach in order to assess the prognosis of patients with cancer [53].

Conclusions

Considering the complex lymphatic drainage of the stomach, we think that only an experienced surgical-pathological team should manage patients with GC, in order to treat the patient and correctly stage the disease. Only by accomplishing these conditions can most of the malignant burden be surgically removed and properly assessed by the pathologist. Moreover, we think that the conducted systematic review and meta-analysis, but also our personal experience regarding this topic, should encourage the surgeon to perform EVLD, in order to achieve an optimal staging of the disease. Also, by performing EVLD, the likelihood of analyzing all the surgically removed lymph nodes increases, which would reflect the quality of the D2 lymphodissection. Finally, the surgical procedure and pathological assessment must be standardized.