Introduction

Gastric cancer (GC) is the fifth most common malignant neoplasm and the third leading cause of cancer-related death globally. For patients with resectable GC, the recommended surgical procedure is standard gastrectomy with D2 lymphadenectomy [1]. The Japanese, Korean, Italian, German and British national guidelines recommend the D2 procedure as the standard surgical treatment with curative intent, as reported by the European Society for Medical Oncology (ESMO) guidelines and by the joint ESMO, European Society of Surgical Oncology (ESSO) and European Society of Radiotherapy and Oncology (ESTRO) guidelines [2].

Over recent decades, minimally invasive surgery of the stomach has been increasingly employed worldwide. Laparoscopic gastrectomy (LG) has been routinely used for the treatment of GC, supported by strong evidence that LG is technically safe and leads to better short-term outcomes than conventional open gastrectomy for early stage gastric cancer [3,4,5,6,7,8,9,10,11,12]. However, the diffusion of laparoscopic surgery is limited by the technical difficulty of total gastrectomy and of D2 lymphadenectomy, which entails the removal of node stations along the celiac trunk, left gastric artery and hepatic pedicle [13, 14]; these factors restrict the correct execution of D2 spleen-preserving LG for advanced gastric cancer to high-volume centres.

Since the first robot-assisted gastrectomy was reported by Hashizume et al. in 2003 [15], robotic gastrectomy (RG) has been claimed to facilitate complex reconstruction after gastrectomy and lymph node dissection, and to assure oncologic safety in advanced gastric cancer patients as well [16,17,18]. In the current literature, many observational studies have reported the effectiveness and safety of RG [19,20,21,22,23], and previous meta-analyses [24,25,26] have highlighted a lower complication rate and less bleeding in the robotic group than in the laparoscopic group.

Since several previous systematic reviews on the comparison of RG and LG are available and timely evidence is required to inform the scientific community, we believed a de novo systematic review was inappropriate, and, as reported in our published protocol [27], we performed a comprehensive umbrella review to collect and assess information from previous systematic reviews that have compared the laparoscopic with robotic gastrectomy.

Umbrella reviews are syntheses of existing systematic reviews and/or meta-analyses providing an ideal method to comprehensively review the evidence base and to explore the contradictory findings of previous reviews [28].

The aim of this review is to investigate the benefits and harms of robotic gastrectomy compared with the laparoscopic approach by drawing on the findings of high-quality systematic reviews, in order to give surgeons and policymakers a comprehensive overview of the depth and strength of the scientific evidence on the feasibility of robotic gastrectomy for gastric cancer.

Methods and analysis

This umbrella review was designed using the methodology guidelines for umbrella reviews provided by the Joanna Briggs Institute [28]. We also followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [29] (Appendix 1). The protocol has been registered with PROSPERO (no. CRD42019139906) and has been published [27]. The review was performed following the protocol without deviation.

Search strategy, study selection and data collection

We searched for systematic reviews and meta-analyses comparing the outcomes of robotic gastrectomy (RG) and laparoscopic gastrectomy (LG) in patients with gastric cancer. A literature search was conducted in the PubMed, Cochrane and Embase databases for all articles published up to December 2019. The “related article” function of PubMed was used to identify further articles that were potentially eligible for inclusion in the review. The bibliographies of all selected articles were hand searched to identify additional articles that met our inclusion criteria [27].

Two independent reviewers (LM and DF) screened titles, abstracts and full-text records in duplicate. Data were extracted by two authors (LM and DF), who independently reviewed and screened all eligible studies for content according to the inclusion criteria indicated in the protocol. We extracted only data pertaining to the comparison between RG and LG. The quality of the included studies was assessed by the two reviewers using the AMSTAR (A Measurement Tool to Assess Systematic Reviews) checklist [30]: only one of the included studies scored 6 points on the AMSTAR checklist, and the others scored 7 or more points (Table 1).

Table 1 AMSTAR score for included meta-analyses

Statistical analyses

For each meta-analysis, we estimated the summary effect size and its 95% CI using random-effects models. For the largest study of each meta-analysis, we estimated the SE of the effect size and examined whether the SE was less than 0.1. In a study with an SE of less than 0.1, the difference between the effect estimate and the upper or lower limit of the 95% confidence interval is less than 0.2 (i.e., this uncertainty is less than what is considered a small effect size). In the case of meta-analyses with continuous data, the effect estimate was transformed to an odds ratio with an established formula [31]. Between-study heterogeneity was assessed via the I² metric. I² ranges between 0 and 100% and is the ratio of the between-study variance to the sum of the within- and between-study variances. Values exceeding 50% are usually considered to represent large heterogeneity.
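The quantities described above can be illustrated with a minimal sketch. It assumes a DerSimonian-Laird estimator for the random-effects model and the Higgins estimator of I², neither of which is named in the text, and it assumes that the "established formula" [31] is the common ln(OR) = d·π/√3 conversion for a standardized mean difference d. All function names are hypothetical and the sketch does not reproduce the STATA analysis.

```python
import math
from statistics import NormalDist

Z_975 = NormalDist().inv_cdf(0.975)  # about 1.96, for two-sided 95% intervals


def random_effects(effects, std_errors):
    """Pool per-study effects with a DerSimonian-Laird random-effects model.

    effects: study estimates on an additive scale (e.g. log OR or WMD).
    std_errors: the corresponding standard errors.
    Returns (pooled, ci_low, ci_high, i2_percent, tau2).
    """
    k = len(effects)
    w = [1.0 / se ** 2 for se in std_errors]            # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0     # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # Higgins I² in percent
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return pooled, pooled - Z_975 * se_pooled, pooled + Z_975 * se_pooled, i2, tau2


def ci_half_width(se):
    """Distance from the estimate to either 95% confidence limit."""
    return Z_975 * se


def smd_to_log_or(d):
    """Assumed conversion of a standardized mean difference to a log odds ratio."""
    return d * math.pi / math.sqrt(3)
```

For a study with SE = 0.1, `ci_half_width(0.1)` is about 0.196, which is the "less than 0.2" threshold mentioned above.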

We evaluated whether there was evidence for small-study effects using the Egger test [32]. A P value less than 0.1, with a more conservative effect in larger studies, was judged to be evidence for small-study effects. We applied the excess statistical significance test, which evaluates whether the observed number of studies with nominally statistically significant results differs from the expected number.
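As an illustration of the small-study-effects check, the sketch below implements the classical Egger regression, in which the standardized effect (effect/SE) is regressed on precision (1/SE) and the intercept is tested against zero. The function name is hypothetical. To stay free of external dependencies it returns the t statistic and its degrees of freedom rather than a P value; the P value would come from a t distribution with k − 2 degrees of freedom in any statistics package.

```python
import math


def egger_test(effects, std_errors):
    """Egger regression for small-study effects.

    Regresses effect/SE on 1/SE by ordinary least squares and returns
    (intercept, se_intercept, t_statistic, degrees_of_freedom).
    An intercept far from zero suggests funnel-plot asymmetry.
    """
    k = len(effects)
    if k < 3:
        raise ValueError("Egger regression needs at least three studies")
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    mx = sum(x) / k
    my = sum(y) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are equal; slope is undefined")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    df = k - 2
    se_int = math.sqrt((rss / df) * (1.0 / k + mx ** 2 / sxx))
    # With a perfect fit the standard error is zero and t is undefined.
    t_stat = intercept / se_int if se_int > 0 else math.nan
    return intercept, se_int, t_stat, df
```

When every study estimates the same effect, the standardized effects lie on a line through the origin and the intercept is zero; an effect that grows with the standard error shifts the intercept away from zero.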

Finally, we identified outcomes that had the strongest statistical support for association and no signals of high heterogeneity or bias. Specifically, we used the following categories:

  • convincing (class I) when number of cases > 1000, p < 10⁻⁶, I² < 50%, 95% prediction interval excluding the null, no small-study effects and no excess significance bias;

  • highly suggestive (class II) when number of cases > 1000, p < 10⁻⁶, largest study with a statistically significant effect and class I criteria not met;

  • suggestive (class III) when number of cases > 1000, p < 10⁻³ and class I–II criteria not met;

  • weak (class IV) when p < 0.05 and class I–III criteria not met;

  • non-significant when p > 0.05.
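The grading rules listed above can be expressed as a short decision function. This is a sketch only: the `Outcome` container and its field names are hypothetical, and p = 0.05 exactly, which the criteria leave unspecified, is treated here as non-significant.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """Summary statistics for one outcome (field names are illustrative)."""
    n_cases: int
    p_value: float
    i2: float                         # heterogeneity in percent
    pi_excludes_null: bool            # 95% prediction interval excludes the null
    small_study_effects: bool
    excess_significance: bool
    largest_study_significant: bool


def classify(o: Outcome) -> str:
    """Assign an evidence class by checking the criteria from strongest to weakest."""
    if o.p_value >= 0.05:             # assumption: p == 0.05 counts as non-significant
        return "non-significant"
    large = o.n_cases > 1000
    if (large and o.p_value < 1e-6 and o.i2 < 50 and o.pi_excludes_null
            and not o.small_study_effects and not o.excess_significance):
        return "class I"
    if large and o.p_value < 1e-6 and o.largest_study_significant:
        return "class II"
    if large and o.p_value < 1e-3:
        return "class III"
    return "class IV"
```

Checking the classes in descending order implements the "class I–III criteria not met" clauses without restating them in each branch.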

The statistical analysis and the power calculations were done with STATA version 12.0.

Results

Search strategy

One hundred and fifty-six records were found in our literature search (Appendix 2). Of these, 137 were excluded after rapid screening of the title and abstract. A further five articles were excluded after full-text screening. In total, we selected 14 meta-analyses (Table 2). The full list of the included studies is available in Appendix 3.

Table 2 Study characteristics

Review characteristics

All the 14 included meta-analyses [23, 33,34,35,36,37,38,39,40,41,42,43,44,45] compare short-term outcomes between laparoscopic and robotic total/subtotal gastrectomy with curative intent in adult patients with diagnosis of resectable gastric cancer (Table 3).

Table 3 Visual indicator of the outcomes

Every study compares the short-term outcomes of robotic surgery with those of the laparoscopic approach in terms of operation time, blood loss, number of harvested lymph nodes and length of hospital stay. Eleven studies [23, 34,35,36,37,38, 40,41,42, 44, 45] compare the total complication rate after gastrectomy; three [23, 37, 39] analyse the rate of conversion to the open technique, five [38, 39, 41, 43, 44] the anastomotic leakage rate, three [38, 39, 41] the anastomotic stenosis rate, two [38, 39] intestinal occlusion and just one [38] postoperative bleeding.

Seven meta-analyses [23, 33, 35, 37,38,39, 43] consider the postoperative mortality rate and three [33, 39, 43] the morbidity rate.

Only three studies [34, 35, 39] report the time to first oral intake and five [23, 34, 35, 37, 39] the time to first flatus.

The oncological outcomes in terms of total retrieved lymph nodes were compared in selected meta-analyses. Interestingly, nine studies [23, 35,36,37,38,39, 41, 42, 44] report the proximal and distal margins of resection. Only one meta-analysis [37] reports the 3-year overall survival and the 3-year disease-free survival, and another [35] reports the recurrence-free survival.

Summary of outcomes

In the following paragraphs, we describe the findings from the included meta-analyses. In each review we found data for the primary outcomes: operation time, intraoperative bleeding, length of stay and number of harvested lymph nodes. We also analysed other outcomes reported in the selected studies, namely conversion rate, mortality rate, morbidity, total complication rate, anastomotic leakage, anastomotic stenosis, intestinal obstruction, proximal and distal margins, and time to first flatus and to oral intake.

For each review we extracted, for continuous variables, the weighted mean difference (WMD), the 95% confidence interval (95% CI) and the heterogeneity. For discrete variables, we reported the odds ratio (OR), the 95% confidence interval and the heterogeneity (Table 4).

Table 4 Outcomes

Primary outcomes

Operative time

All selected meta-analyses report a longer operative time for the robotic group than for the laparoscopic group. A statistically significant increase is reported in eleven studies [23, 33, 35,36,37, 39,40,41, 43,44,45]. This result is unsurprising, given the extra time needed for robotic docking and undocking reported in the literature [20].

Intraoperative bleeding

All studies report a reduction in intraoperative bleeding in the robotic gastrectomy group compared with the laparoscopic gastrectomy group, and in only three does the difference fail to reach statistical significance [34, 41, 45]. This result could be explained by the magnification of the operative field obtained with the robotic three-dimensional optics, together with the higher precision in small movements and the tremor filtering of the robotic system [46].

Length of hospital stay

Each meta-analysis reports a small reduction in hospital stay for patients who underwent robotic gastrectomy, except for Wang Z et al. [45], who report a negligible increase in hospitalization for the RG group. These results reach statistical significance in only three meta-analyses [23, 39, 44].

Number of harvested lymph nodes

The number of harvested lymph nodes is an important measure of the oncological outcome of gastrectomy: in 12 meta-analyses [23, 33,34,35,36,37,38, 40, 42,43,44] this number is higher for the robotic technique, and in only two studies [41, 45] is it lower than for laparoscopic gastrectomy. Only two meta-analyses [23, 44] show statistically significant results, both reporting a higher number of harvested lymph nodes.

Secondary outcomes

Overall complication rate

Eleven meta-analyses [23, 34,35,36,37,38, 40,41,42, 44, 45] reported the incidence of overall complication after surgery and no significant difference between robotic and laparoscopic gastrectomy in terms of incidence of total complications was reported.

Proximal and distal margin of resection

Nine studies [23, 36,37,38,39,40, 42, 43, 45] analysed the difference in the length of the proximal and distal margins of resection from the tumour: the proximal margin was longer in patients with RG in seven studies [35,36,37,38,39, 42, 44], in only one [41] the distance was substantially the same and in another one [23] it was longer in patients who underwent LG. Overall, these data do not show statistical significance. The distal margin was longer in RG in five studies [36, 37, 39, 42, 44], whereas in four [23, 35, 38, 41] it was longer in LG: three studies [39, 42, 44] report a statistically significant increase in the length of the free distal margin of resection and only one [41] a reduction of this parameter.

Mortality and morbidity

Six meta-analyses [33, 35, 37,38,39, 43] report a higher mortality rate in RG, while Hu LD et al. [23] report a lower mortality rate, with no statistical significance. Only three studies report the morbidity rate after gastrectomy, with conflicting results. Xiong et al. [33] found lower morbidity in RG, while others [39, 43] found the opposite, with no statistical significance either way.

Anastomotic leakage, anastomotic stenosis and intestinal obstruction

Five studies report anastomotic leak rate: four papers [38, 39, 41, 43] indicate a higher incidence of anastomotic leakage in patients who underwent RG and one [44] a lower incidence. Three meta-analyses [38, 39, 41] outline a lower incidence of anastomotic stenosis and two [38, 39] a higher recurrence of intestinal occlusion. All these findings are not significant from a statistical point of view.

Time to first flatus and oral intake

Five meta-analyses [23, 34, 35, 37, 39] investigate the time to first flatus and only three [34, 35, 39] the time to first oral intake after surgery: all the studies indicate a faster recovery of bowel function in patients who underwent RG and, of these, one [23] is statistically significant for time to flatus and all three [34, 35, 39] are statistically significant for time to oral intake.

Conversion to open surgery

The risk of conversion to open surgery is higher in RG in all three studies [23, 37, 39] that investigate this issue. Nevertheless, no statistical significance is reported.

Stratification of evidence

Based on the previously reported classification method we obtained three different levels of evidence for each outcome analysed in the review: only the oral intake is supported by suggestive evidence, operation time and blood loss are supported by weak evidence and the other outcomes are classified as non-significant (Table 5).

Table 5 Stratification of evidence

Discussion

Main findings and interpretation in light of existing evidence

In this umbrella review of systematic reviews and meta-analyses evaluating the current evidence for potential benefits and harms of robotic gastrectomy compared with laparoscopic gastrectomy for gastric cancer, we summarized 14 studies covering 146 primary studies overall and more than 37,500 subjects. Our assessment did not show an overall excess of findings with significant results, in contrast with other medical specialties, in which an excess of significant results is reported [47,48,49]. In our study, a large proportion of the examined meta-analyses did not show large heterogeneity, although some did.

The Egger test is particularly difficult to interpret when between-study heterogeneity is large. Heterogeneity might often be a manifestation of bias in some studies of a meta-analysis, but could also emerge from genuine differences across studies. Some reasons for heterogeneity include the mixture of cohort studies and case–control studies in some of the meta-analyses, and differences in the populations analysed, in the reproducibility of the surgical technique and in the stage of gastric cancer.

The outcomes reported for each surgical technique need to be interpreted with caution, in particular for the meta-analyses in which the heterogeneity is large, the number of studies is relatively small or the largest study is more conservative than the summary effect.

According to the statistical analyses, the association of robotic gastrectomy with a shorter time to oral intake is supported by suggestive evidence, the highest class reached by any outcome in our stratification. The data regarding less intraoperative bleeding and longer operation time for the robotic approach are supported by weak evidence. On the other hand, the data regarding the other outcomes are insufficient, as well as non-significant from an evidence point of view, to draw any robust conclusion.

As observed in each selected meta-analysis, intraoperative blood loss was lower in the RG than in the LG groups. From a theoretical point of view, the robotic procedure is a more precise technique that could help surgeons visualize small vessels. Furthermore, the robotic arms are more stable than a surgeon's hands, leading to a significant reduction of musculoskeletal fatigue and physiologic tremor over time in surgeons [36]. In addition, the improved dexterity of an internal articulated wrist provides greater flexibility in a restricted operative field, and the stereoscopic vision enables surgeons to effectively minimize the risk of tissue and blood vessel injuries as well as intraoperative bleeding. In the end, we found strong evidence for intraoperative bleeding, shedding light on this benefit of robotic gastrectomy when compared with LG.

Thirteen studies showed that the hospital stay in the RG groups was slightly shorter (by nearly a day) than that in the LG groups, reaching statistical significance in only three meta-analyses. Similarly, other factors that should have an important impact on postoperative recovery, such as time to diet and to first flatus, were shorter in the RG groups. Based on these results, we postulate that the faster recovery of patients receiving the robotic approach accounts for the difference in postoperative hospital stay between the two groups. The evidence from our study is highly suggestive for these benefits; therefore, surgeons and policy makers should consider the robotic approach as an acceptable option in treating gastric cancer.

As prognostic factors of surgical therapy from an oncological point of view, the number of resected lymph nodes and the length of the resection margins cannot be ignored. In our umbrella review, although the number of nodes retrieved with robotic gastrectomy was higher, this finding was weakened by its low class in the evidence stratification. Several studies [23, 33,34,35,36,37,38, 40, 42,43,44] report that the number of lymph nodes retrieved during extra-perigastric lymphadenectomy, especially at the splenic pedicle, the splenic hilum and the supra-pancreatic areas, was significantly higher in the robotic group than in the laparoscopic group. However, the operative steps of robotic lymph node dissection are generally the same as those of laparoscopy. We could postulate that traditional straight laparoscopic instruments fail to help surgeons reach deep-seated vessels and these nodal areas. In addition, tremor filtering, wristed instruments, stable exposure and high-resolution imaging enable surgeons to execute surgical manoeuvres thoroughly [50]. However, most of the primary studies included patients who underwent either subtotal or total gastrectomy without distinction, and the stage of gastric cancer was not the same for all enrolled patients. Case-matched studies comparing the robotic and laparoscopic approaches according to the type of gastrectomy and the extent of lymphadenectomy are therefore needed to reduce bias; given that only two of the 12 meta-analyses reached statistical significance, surgeons and policy makers should consider the marginal superiority of robotic gastrectomy in lymph-node retrieval with caution. In addition, evidence for a difference in margins between the robotic and laparoscopic groups is only suggestive.
As a pathological parameter, the proximal margin was longer in the RG group, while the distal margin showed conflicting results between the two groups. These findings may open up new research directions.

The prolonged operating time in RG was shown in all the included meta-analyses, suggesting a possible negative impact on postoperative outcomes due to prolonged exposure to pneumoperitoneum and the associated increase in anesthesia time. However, previous studies investigating the effect of longer operation time in patients receiving laparoscopic gastrectomy did not show detrimental surgical results [35]. One of the most important reasons for the prolonged time is that robotic gastrectomy requires “setting and docking” time for the robot, which inevitably results in a longer operative time, requiring almost 30 min of extra time [51,52,53,54]. In addition, the learning curve for robotic gastrectomy significantly affects the time spent during surgery. Eom BW et al. [21] stated that intervention time was reduced after at least 15 robotic gastrectomies. Similarly, Woo JH et al. [24] demonstrated a reduction of the mean operative time from 233 to 219 min after the execution of 100 cases. Considering the development of robotic surgery systems, growing experience and a shortened learning curve, we can postulate that RG is technically feasible with regard to operation time.

Interestingly, the prolonged operation time of RG was not associated with any increase in postoperative complications, mortality or conversion rate. It is postulated that technical advantages such as 3D vision and tremor filtering could contribute to safer use of the robotic system for gastric surgery [21, 47].

Because few meta-analyses were included, an umbrella review for cost evaluation was not performed. However, Hyun et al. [41] and Chen et al. [35] report that RG costs on average €3,189 and 3900 USD more than LG, respectively; most of this amount, around €2831, is attributable to the DaVinci robotic system itself. On the basis of what both authors report, the possible advantages of the robotic approach would not justify the higher cost; however, when the overall costs related to hospitalization are considered, we conclude that the higher operating costs are ultimately offset by the reduction in complications and in hospitalization time.

The results from primary studies are consistent with the findings of our umbrella review, as we found highly suggestive evidence that RG and LG are equivalent with regard to safety and feasibility; the robotic approach can therefore be considered a safe and non-inferior alternative to LG in treating gastric cancer.

Strengths and limitations

We performed this detailed umbrella review to assess the benefits and harms of robotic gastrectomy compared with the laparoscopic approach. In addition, we used comprehensive and systematic criteria to grade evidence levels and rate the strength of these systematic reviews and meta-analyses. Our review inevitably has limitations and drawbacks. First, we relied fully on the accuracy of the data provided in the included meta-analyses. As such, problems within the published data may affect the evidence-rating results despite our analyses. All the meta-analyses included in this review pooled retrospective non-randomized studies, and to date no randomized clinical trials (RCTs) comparing RG and LG are available. Another limitation is that significant heterogeneity was recognized in some characteristics of the primary studies. Several papers included patients who underwent either subtotal or total gastrectomy without distinction. The stage of gastric cancer was also not the same for all enrolled patients. The majority of the studies were from Eastern populations, whereas a minority were from Europe. The classification of the evidence supporting each outcome showed that none of the outcomes analysed in this review is supported by convincing or highly suggestive evidence.

On the other hand, we are convinced of the strength of our umbrella review, since the methodological quality of all included systematic reviews and meta-analyses was considered high.

Conclusions

In conclusion, the safety and efficacy of robotic gastrectomy are not clearly supported by strong evidence, suggesting that the outcomes reported for each surgical technique need to be interpreted with caution, in particular for the meta-analyses in which the heterogeneity is large. Robotic gastrectomy is associated with shorter time to oral intake, less intraoperative bleeding and longer operation time, with an acceptable level of evidence. On the other hand, the data regarding the other outcomes are insufficient, as well as non-significant from an evidence point of view, to draw any robust conclusion.