Introduction

Gastric cancer remains a serious threat to human health: it is the third leading cause of cancer death and the fifth most commonly diagnosed cancer worldwide [1]. Surgical resection is considered the gold standard of treatment, and open gastrectomy with lymphadenectomy has long held a dominant position. Since Kitano et al. [2] first reported laparoscopic gastrectomy (LG) for gastric cancer in 1994, LG has gradually spread worldwide.

Minimally invasive surgery (MIS) has become an important trend in surgical practice. In recent years, LG has been recognized for offering the advantages of MIS in the treatment of gastric cancer, such as less blood loss, reduced invasiveness, less postoperative pain, earlier recovery of intestinal function, shorter hospital stay, and fewer complications [3,4,5,6,7,8,9]. Clinical trials comparing laparoscopic with open surgery have shown that laparoscopic radical gastrectomy achieves the same long-term outcomes as open radical gastrectomy [10,11,12]. However, conventional laparoscopic surgery has its own limitations, including two-dimensional imaging, a decreased sense of touch, amplification of hand tremor, lack of flexibility, and a limited range of instrument movement. In addition, LG is physically demanding and requires a long learning curve for surgeons, especially in lymph node dissection [13].

Recently, robot-assisted surgery, an emerging technology, has been used to overcome the technical drawbacks of conventional laparoscopic surgery. Its advantages include high-definition three-dimensional stereo video, flexible movement of the robotic arms, tremor suppression, and a stable image [14,15,16]. The Da Vinci robotic surgical system has opened a new era of MIS and has been widely used in cardiovascular, urinary tract, hepatobiliary, and gynecological surgery [17]. Since Hashizume et al. [18] reported the first robotic gastrectomy (RG) in 2002, studies on RG have been widely reported.

Many studies have reported the safety and feasibility of RG, which helps to establish the status of RG in the treatment of gastric cancer. However, these studies had small sample sizes, single-institution designs, and different systems for grading complications, which limited their ability to reach objective conclusions. It therefore remains unclear whether RG can achieve a surgical effect equal to, or even better than, that of LG. We conducted this systematic review and meta-analysis to explore and compare the clinical efficacy of RG and LG.

Methods

Search strategy

The present study complied with the PRISMA guidelines, and the PRISMA checklist was completed [19]. A systematic literature search was performed in PubMed, Cochrane Library, WanFang, CNKI, and VIP for studies published before May 2020 that compared RG with LG, using the following search terms: gastric cancer, gastric carcinoma, laparoscopic, robotic, and gastrectomy. In addition, the reference lists of all relevant articles were searched to identify additional literature. Only studies published in Chinese or English were included.

Inclusion criteria and exclusion criteria

Included studies had to meet the following criteria: (1) clinical research comparing RG with LG for patients with gastric cancer; (2) full-text article containing the data necessary for statistical analysis, or including at least one of the following clinical outcomes: estimated blood loss, time to flatus, retrieved lymph nodes, operative time, length of hospital stay, proximal and distal margin distance, complications, mortality, overall survival (OS), relapse-free survival (RFS), and recurrence rate; (3) if the same authors or center reported two or more studies, the most recent, largest, or highest-quality publication was included. If two or more studies from the same center included entirely different patients, the data from all of those studies were analyzed.

Articles were excluded if they met any of the following criteria: (1) letters, review articles, conference reports, comments, case reports, and animal experimental studies; (2) articles including non-gastric cancer cases, such as gastrointestinal stromal tumors or benign gastric diseases; (3) articles without the data necessary for statistical analysis.

Data extraction and quality assessment of included studies

Two authors independently reviewed all included studies and extracted the relevant data according to the inclusion and exclusion criteria, and then cross-checked the results. Disagreements were resolved by further discussion until a final decision was reached. The following data were collected from each study: first author, publication year, country, study design, sample size (RG group and LG group), age, body mass index (BMI), extent of resection, estimated blood loss (EBL), time to flatus, retrieved lymph nodes, operative time, length of hospital stay, proximal and distal margin distance, complications, mortality, OS, RFS, and recurrence rate. If a study reported medians and ranges, the means and standard deviations (SDs) were estimated as described by Hozo et al. [20]. The Newcastle-Ottawa Scale (NOS) was used to assess the quality of the included studies (http://www.ohri.ca/programs/clinical_epidemiology/oxford.asp). Scores range from 0 to 9 stars. Although a score of 6 or more is generally regarded as indicating high quality, only studies with a score of 7 or more were considered high-quality and included in this meta-analysis.
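The conversion from medians and ranges to means and SDs can be sketched as follows. This is a minimal illustration of the commonly cited rules from the method of Hozo et al., not the authors' own code; the sample-size cut-offs (25 for the mean; 15 and 70 for the SD) and the function name hozo_mean_sd are assumptions, since the text does not state which variant of the method was applied.

```python
import math


def hozo_mean_sd(low, median, high, n):
    """Approximate the mean and SD from a median and range.

    Assumed rules (after Hozo et al.):
      mean: (low + 2*median + high) / 4 for n <= 25, else the median itself
      SD:   variance formula for n <= 15, range/4 for 15 < n <= 70,
            range/6 for n > 70
    """
    if n <= 25:
        mean = (low + 2 * median + high) / 4
    else:
        mean = median

    value_range = high - low
    if n <= 15:
        variance = ((low - 2 * median + high) ** 2 / 4 + value_range ** 2) / 12
        sd = math.sqrt(variance)
    elif n <= 70:
        sd = value_range / 4
    else:
        sd = value_range / 6
    return mean, sd


# Hypothetical example: operative time with median 180 min, range 100-300 min,
# reported for a group of 40 patients.
example_mean, example_sd = hozo_mean_sd(100, 180, 300, 40)
```

Estimates of this kind are approximations, so outcomes that rely heavily on converted values carry additional uncertainty.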

Statistical analysis

The meta-analysis was performed using Review Manager 5.3 (Cochrane Collaboration, Oxford, UK). Continuous variables were assessed using the weighted mean difference (WMD) with a 95% confidence interval (CI), and dichotomous variables using the odds ratio (OR) with a 95% CI. Survival data, such as OS and RFS, were assessed using the hazard ratio (HR) with a 95% CI. Heterogeneity was evaluated with the I2 statistic: I2 < 25%, 25% ≤ I2 ≤ 50%, and I2 > 50% were regarded as low, moderate, and high heterogeneity, respectively. If heterogeneity was high (I2 > 50% or P < 0.05), a random-effects model was adopted; otherwise, a fixed-effects model was used. A funnel plot of overall complications was used to evaluate potential publication bias. P < 0.05 was considered statistically significant.
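The pooling and model-selection rule described above can be illustrated with a short sketch. It is not the Review Manager implementation: it assumes generic inverse-variance weighting and a DerSimonian-Laird estimate of between-study variance, applies only the I2 > 50% criterion (the P-value criterion for the heterogeneity test is omitted to keep the example dependency-free), and uses hypothetical function names and input data.

```python
import math

Z_975 = 1.959964  # normal quantile for a two-sided 95% CI


def heterogeneity_label(i2):
    """Classify I2 (in percent) using the thresholds stated in the text."""
    if i2 < 25:
        return "low"
    if i2 <= 50:
        return "moderate"
    return "high"


def pool(effects, std_errors):
    """Pool per-study effects (e.g. mean differences) by inverse variance."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w

    # Cochran's Q and the I2 statistic
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance (assumed estimator)
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Model choice: random effects only when I2 exceeds 50%
    if i2 > 50:
        model = "random"
        weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    else:
        model = "fixed"

    total_w = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total_w
    se = math.sqrt(1.0 / total_w)
    return {
        "model": model,
        "estimate": estimate,
        "ci": (estimate - Z_975 * se, estimate + Z_975 * se),
        "i2": i2,
        "heterogeneity": heterogeneity_label(i2),
    }


# Hypothetical per-study mean differences (min) and their standard errors.
result = pool([-30.0, -45.0, -20.0], [5.0, 6.0, 4.0])
```

Under the random-effects model the between-study variance is added to each study's variance, so the weights become more even and the confidence interval widens.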

Results

Selected studies

A total of 430 potentially relevant articles published before May 2020 were retrieved from the databases. After 66 duplicates were removed, 246 articles were excluded on the basis of their titles and abstracts because they were reviews, letters, conference reports, comments, case reports, or animal experimental studies. The remaining 118 articles were evaluated in full text, and 19 retrospective studies were finally included in the meta-analysis according to the inclusion and exclusion criteria [21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39]. A flow diagram of the search strategy, including the reasons for exclusion, is shown in Fig. 1.

Fig. 1 Flow chart of literature search strategies

Study characteristics and quality

Nineteen studies with a total of 7275 patients were included, of whom 4598 were in the LG group and 2677 in the RG group. Fourteen of the included studies were published in English [23,24,25,26,27,28,29,30,33,34,35,36,37,38] and 5 in Chinese [21,22,31,32,39]. Among the 19 studies, 13 were from China [21,22,23,24,29,31,32,33,35,36,37,38,39], 4 from Korea [25,26,27,30], and 2 from Japan [28,34]. The basic characteristics of the included studies are listed in Table 1. The quality assessment according to the NOS is shown in Table 2. Of the 19 studies, 6 scored 9 stars [24,25,26,32,34,35], 2 scored 8 stars [27,33], and 11 scored 7 stars [21,22,23,28,29,30,31,36,37,38,39].

Table 1 Main characteristics of studies included in the meta-analysis
Table 2 Assessment of the quality of the studies based on the NOS

Short-term outcomes

Figs. 2, 3, 4, and Table 3 show the results of the meta-analysis for short-term and long-term outcomes. Eighteen studies reported the operative time. Because there was significant heterogeneity among these 18 studies (I2 = 94%, P < 0.001), a random-effects model was adopted. The meta-analysis revealed that the operative time was longer for RG than for LG (WMD = −32.96, 95% CI −42.08 ~ −23.84, P < 0.001) (Fig. 2a). The EBL was reported in 17 studies. Because of significant heterogeneity (I2 = 81%, P < 0.001), a random-effects model was used. The meta-analysis showed that the EBL was lower in RG than in LG (WMD = 28.66, 95% CI 18.59 ~ 38.73, P < 0.001) (Fig. 2b).

Fig. 2 Forest plot of the meta-analysis for intraoperative and postoperative parameters. a Operation time. b Estimated blood loss. c Time to first flatus. d Length of hospital stay. e Overall postoperative complications. f Mortality

Fig. 3 Forest plot of the meta-analysis for pathology details. a Number of retrieved lymph nodes. b Proximal margin distance. c Distal margin distance

Fig. 4 Forest plot of the meta-analysis for survival outcomes. a Overall survival. b Relapse-free survival. c Recurrence rate

Table 3 Results of the meta-analysis

The pooled analysis showed that the time to first flatus was shorter for RG than for LG, with high heterogeneity (WMD = 0.16, 95% CI 0.06 ~ 0.27, P = 0.003, I2 = 65%) (Fig. 2c). All studies reported the length of hospital stay. A random-effects model was used because of significant heterogeneity (I2 = 93%, P < 0.001). The pooled results showed no difference in hospital stay between the RG and LG groups (WMD = 0.23, 95% CI −0.53 ~ 0.98, P = 0.560) (Fig. 2d). All 19 studies reported overall postoperative complications. Because there was no significant heterogeneity (I2 = 0%, P = 0.880), a fixed-effects model was used. The analysis revealed no significant difference between the RG and LG groups (OR = 1.07, 95% CI 0.91 ~ 1.25, P = 0.430) (Fig. 2e). Moreover, 5 studies, with a total of 2148 gastric cancer patients, reported mortality. A fixed-effects model was used because there was no significant heterogeneity (I2 = 0%, P = 0.820). No significant difference in mortality was found between the two techniques, although the pooled result suggested numerically higher mortality in the LG group than in the RG group (OR = 0.67, 95% CI 0.24 ~ 1.90, P = 0.450) (Fig. 2f).

All of the included studies reported the number of harvested lymph nodes. Because of significant heterogeneity (I2 = 83%, P < 0.001), a random-effects model was adopted. The analysis revealed that the number of harvested lymph nodes was similar between the RG and LG groups (WMD = −0.96, 95% CI −2.12 ~ 0.20, P = 0.100) (Fig. 3a). Seven studies reported the proximal margin, and a fixed-effects model was adopted because no significant heterogeneity was observed (I2 = 28%, P = 0.220). The proximal margin was not significantly different between the two groups (WMD = −0.10, 95% CI −0.29 ~ 0.09, P = 0.300) (Fig. 3b). For the distal margin, the difference between the two groups was also not significant (WMD = 0.15, 95% CI −0.21 ~ 0.52, P = 0.410), but the heterogeneity was significant (I2 = 59%, P = 0.030) (Fig. 3c).

Long-term outcomes

OS was reported in 6 studies. Because there was no significant heterogeneity (I2 = 0%, P = 0.860), a fixed-effects model was used, and the pooled analysis indicated no significant difference between the two techniques (HR = 0.95, 95% CI 0.76 ~ 1.18, P = 0.640) (Fig. 4a). RFS was reported in 3 studies, which included a total of 1172 gastric cancer patients. With no obvious heterogeneity (I2 = 0%, P = 0.910), a fixed-effects model was used, and the pooled results suggested that RFS was similar between the RG and LG groups (HR = 0.91, 95% CI 0.69 ~ 1.21, P = 0.530) (Fig. 4b). Five studies reported recurrence rates. The pooled results showed no significant difference in the recurrence rate between the two groups (OR = 0.90, 95% CI 0.67 ~ 1.21, P = 0.500), with no significant heterogeneity (I2 = 0%, P = 0.620) (Fig. 4c).
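As a reading aid, the relationship between a reported point estimate, its 95% CI, and its P value can be sketched as below. This is a standard back-calculation under a normal approximation, not part of the authors' analysis; the function name p_from_ci is hypothetical. Ratio measures (OR, HR) are handled on the log scale, and differences (WMD) on the original scale.

```python
import math

Z_975 = 1.959964  # normal quantile for a two-sided 95% CI


def p_from_ci(estimate, lower, upper, ratio=False):
    """Recover the standard error, z statistic, and two-sided P value
    from a point estimate and its 95% confidence interval."""
    if ratio:
        # Ratio measures are approximately normal on the log scale.
        estimate, lower, upper = (math.log(v) for v in (estimate, lower, upper))
    se = (upper - lower) / (2 * Z_975)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# Values reported in the text above.
_, _, p_os = p_from_ci(0.95, 0.76, 1.18, ratio=True)  # overall survival HR
_, _, p_nodes = p_from_ci(-0.96, -2.12, 0.20)         # harvested lymph nodes WMD
```

Applied to the OS hazard ratio and the lymph node WMD, this gives P values of roughly 0.65 and 0.10, in line with the reported 0.640 and 0.100 once rounding of the published interval limits is taken into account.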

Sensitivity analysis

We conducted a sensitivity analysis restricted to high-quality papers with more than 7 stars. For the time to first flatus, there was a significant difference between the two techniques (WMD = 0.15, 95% CI 0.05 ~ 0.24, P = 0.002): the time to first flatus was shorter in RG than in LG, with no significant heterogeneity (I2 = 0%, P = 0.900) (Fig. 5). For the number of harvested lymph nodes, more lymph nodes were harvested in RG than in LG (WMD = −1.04, 95% CI −1.98 ~ −0.10, P = 0.030), with no obvious heterogeneity (I2 = 0%, P = 0.430) (Fig. 6).

Fig. 5 Forest plot of the sensitivity analysis for the time to first flatus

Fig. 6 Forest plot of the sensitivity analysis for the number of retrieved lymph nodes

Publication bias

A funnel plot of overall complications was used to evaluate publication bias. The plot was bilaterally symmetrical, showing no evidence of publication bias (Fig. 7).

Fig. 7 Funnel plot of the overall postoperative complications

Discussion

Radical gastrectomy with lymphadenectomy is regarded as the gold standard of treatment for gastric cancer [40]. With the development of minimally invasive techniques, MIS has been increasingly applied in gastrectomy. However, its use in gastric cancer has long been controversial, with the debate centering on complications and mortality. MIS improves quality of life, but it must be ensured that it does not increase complications and mortality, especially with a new technique such as RG [41]. Many studies have compared the safety and the short-term or long-term efficacy of LG with open gastrectomy [42,43,44,45,46], but studies on RG have not been sufficient to demonstrate its effectiveness. We therefore included 19 studies and performed a meta-analysis to explore and compare the clinical efficacy of RG and LG.

The results of the meta-analysis suggested that RG was associated with a longer operative time than LG. On the one hand, the time needed to set up and dock the robotic arms may contribute to the longer operative time [47]. Studies have shown that it takes about 30 min to prepare for robotic surgery [48]. On the other hand, differences in surgeon experience may also lengthen the operative time. Previous research reported that the operative time for RG decreased from the initial cases to those performed after experience had been gained [49,50,51]. Woo et al. [52] reported 236 cases of robotic gastrectomy and found that the mean operative time was reduced from 233 to 219 min when compared with the previous 100 cases. Song et al. [53] described that, after 25 initial learning cases, the time for docking and setting up the robotic arms shortened and stabilized at about 15 min. Docking times can therefore be shortened as experience accumulates. In addition, the learning curve for RG can increase the operative time. With further development of the Da Vinci robotic surgical system, greater experience, and a shortened learning curve, the operative time of robotic surgery can be expected to decrease.

Blood loss during minimally invasive gastrectomy mainly occurs during lymph node dissection and is caused by vascular damage. The meta-analysis indicated that blood loss was lower in RG than in LG. The reason may be that robotic surgery provides a high-definition visual field, eliminates hand tremor, and accurately reveals the small structures around the stomach, which helps surgeons better control bleeding from small blood vessels. Time to first flatus is a factor that may have an important impact on postoperative recovery. The results of the meta-analysis suggested a significant difference in time to first flatus, which differed from the results of previously published studies [17, 47, 54]. To explore the reasons for this discrepancy, we conducted a sensitivity analysis according to the method described by Abraham et al. [55], who stated that the results of combining high-quality non-randomized controlled trials were also convincing when comparing the short-term effects of surgery. The sensitivity analysis showed that the difference between the two groups remained significant, which supports the reliability of our results. The time to first flatus was shorter in RG than in LG, which might be associated with the stable and flexible movements of the robotic arms, which avoid excessive traction on the tissue and accidental injury to the blood vessels and cause less trauma to the patient [56]. In addition, the application of enhanced recovery after surgery (ERAS) in perioperative management may be another important reason for the significant difference. Zhang et al. [21] used this approach to manage patients during the perioperative period and found that the time to first flatus in the RG group was significantly shorter than that in the LG group. Further prospective research is needed to confirm these advantages. However, the meta-analysis showed that the earlier time to first flatus did not translate into a different postoperative hospital stay between the RG and LG groups. There was no statistically significant difference in hospital stay between the two groups, although the result appeared to favor RG.

The postoperative complication rate is an important indicator of short-term outcome. This meta-analysis indicated that the incidence of overall complications was lower in the RG group than in the LG group, although the difference was not statistically significant. Regarding mortality, analysis of the pooled data of the included studies suggested no significant difference between the two groups. According to these results, we believe that RG is safe and acceptable.

Tumor pathology results are key to evaluating the success of gastric cancer surgery. This meta-analysis revealed no significant difference in the proximal or distal margin between the two groups. Radical gastric cancer surgery requires extensive lymph node dissection, which helps to assess gastric cancer staging and patient prognosis more accurately. Regarding the number of harvested lymph nodes, analysis of the pooled data of the included studies revealed that it was similar between the two groups, with no statistically significant difference. Our results were similar to those of previously published studies [17, 47]. Recently, Guerrini et al. [54] published the largest meta-analysis of robotic versus laparoscopic gastrectomy for gastric cancer, which showed a significant difference in the number of harvested lymph nodes between the two groups, contrary to our results. We therefore conducted a sensitivity analysis combining high-quality studies to explore the reasons for the conflicting results. The sensitivity analysis showed a significant difference between the two groups: RG was associated with a significantly greater number of harvested lymph nodes than LG. The main reason is likely that RG offers three-dimensional imaging, a tremor filter, and an internally articulated EndoWrist with 7 degrees of freedom, which contribute to precise dissection and lymphadenectomy, especially of the lymph nodes in the soft tissue around the gastric vessels [17]. Moreover, it may also be related to the continuous advancement of the robotic surgical system and the improving proficiency of surgeons in operating it [57]. According to the standards of radical gastric cancer surgery, RG can achieve the goal of radical gastrectomy in both resection of the primary tumor and lymph node dissection. However, large-scale, multi-center randomized controlled trials are needed to provide more reliable evidence for clinical treatment in the future.

Because gastric cancer is a malignant tumor, the long-term oncological outcomes of patients are a major concern for surgeons. OS is a major oncologic outcome. In this meta-analysis, the OS was similar to that previously reported [58]. The pooled data of the included studies revealed no significant difference between the RG and LG groups in OS, RFS, or the recurrence rate, without heterogeneity. These results showed that the two techniques had similar long-term oncologic outcomes. As far as we know, few meta-analyses have previously reported RFS for RG and LG. The RFS and recurrence rate results further demonstrated the comparability of RG and LG in long-term oncological outcomes. These results suggest that, in terms of oncologic outcomes, RG is a safe technique for the management of gastric cancer.

When considering these results in clinical application, several limitations need to be taken into account. First, although our meta-analysis included a large number of patients, all included studies were retrospective and none were randomized controlled trials, which may affect the quality of the meta-analysis and introduce publication bias. However, no significant publication bias was shown in this meta-analysis. Second, some studies did not report HRs and SDs directly. These data were extracted from the survival curves or estimated from the reported medians and ranges, which is a potential source of bias. Third, most of the included studies were from East Asian countries, and data from Western countries were limited. The generalizability and applicability of these results are therefore limited, and they must be interpreted with caution. Finally, the heterogeneity of operative time, blood loss, and number of retrieved lymph nodes was significant in each case. These parameters could be influenced by the experience of surgeons.

Conclusion

In conclusion, the results suggest that RG is as acceptable as LG in terms of short-term and long-term outcomes. Overall, our meta-analysis indicates that RG is an effective, safe, and promising approach in the treatment of gastric cancer that compensates for the shortcomings of laparoscopy and may offer patients less trauma and quicker recovery. More randomized clinical trials are still needed to further establish the value of robotic surgery for gastric cancer.