Introduction

Gastric cancer (GC) is the fifth most common cancer and the fourth leading cause of cancer-related death worldwide [1]. Surgical resection is the only curative treatment approach, with regional lymphadenectomy recommended as part of radical gastrectomy [2]. Laparoscopic gastrectomy (LG) is increasingly used because of its beneficial short-term effects compared with open gastrectomy [3].

The da Vinci surgical system (DVSS) was developed to overcome the disadvantages of standard minimally invasive surgery using a laparoscope [2]. In our previous multi-institutional prospective study (UMIN000015388), which was approved for Advanced Medical Technology (“Senshiniryo B”) managed by the Ministry of Health, Labour, and Welfare (MHLW), we showed that robotic gastrectomy (RG) for cStage I/II GC reduced the morbidity rate (Clavien–Dindo classification [C–D] grade ≥ IIIa) to less than half of that of LG in a historical control with data from three leading institutions of LG, namely Kyoto, Saga, and Fujita health universities [4]. Consequently, as of April 2018, the MHLW decided to recognize 12 more robotic procedures, including RG, as part of their corresponding conventional minimally invasive procedures for the purposes of medical insurance coverage [5]. Since then, the number of RGs has increased dramatically nationwide, although the following requirements were set for the surgery to be covered by insurance, to restrict its introduction in inexperienced institutions: (1) the operating surgeon should be qualified by the Japan Society for Endoscopic Surgery (JSES) endoscopic surgical skill qualification system (ESSQS) [6], as well as board-certified in gastroenterology by the Japanese Society of Gastroenterological Surgery (JSGS); (2) the operating surgeon should have performed more than 10 RGs, including robotic distal, proximal, and total gastrectomy; and (3) the facility must have performed more than 50 gastrectomies, including 20 laparoscopic or robotic distal, proximal, or total gastrectomies, during the past year [5, 7]. In the Japanese universal health insurance system, the services covered and the fees set for physicians and hospitals are uniform across the nation [8]. A patient generally pays 30% of the fee schedule price, and, when the monthly co-payment exceeds a threshold amount, the co-payment for a patient in an average-income family is reduced to 80,100 + (medical expense − 267,000) × 0.01 JPY/month (high-cost medical expense benefit) [8, 9]. Meanwhile, the Japanese government prohibits the joint provision of medical treatments covered by the universal health insurance and those not covered [2]. Therefore, a patient who undergoes uninsured RG is required to pay approximately 2,200,000 JPY, whereas a patient who undergoes insured RG, like those who undergo LG, is charged approximately 100,000 JPY during perioperative admission in any hospital.
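
The co-payment rule described above can be illustrated with a short calculation. The following is a minimal sketch, not an official calculator: the function name is hypothetical, amounts are treated as whole yen, and it assumes that the "threshold amount" is the point at which the 30% share reaches 80,100 JPY (that is, 30% of 267,000 JPY), which is inferred from the formula rather than stated in the text.

```python
def monthly_copayment_jpy(medical_expense_jpy: int) -> int:
    """Illustrative monthly co-payment for a patient in an average-income family.

    Hypothetical helper; the constants come from the formula quoted in the text.
    """
    copay_percent = 30            # patient share of the fee schedule price
    base_cap = 80_100             # JPY/month, first term of the benefit formula
    reference_expense = 267_000   # JPY, subtracted inside the benefit formula

    standard_share = medical_expense_jpy * copay_percent // 100

    # Assumption: the high-cost benefit applies once the 30% share exceeds
    # the base cap (267,000 x 0.30 = 80,100).
    if standard_share <= base_cap:
        return standard_share

    # 80,100 + (medical expense - 267,000) x 0.01, in whole yen
    return base_cap + (medical_expense_jpy - reference_expense) // 100


if __name__ == "__main__":
    for expense in (100_000, 267_000, 1_000_000):
        print(expense, monthly_copayment_jpy(expense))
```

For a hypothetical monthly medical expense of 1,000,000 JPY, the 30% share would be 300,000 JPY, and the sketch returns 80,100 + 7,330 = 87,430 JPY.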

The Japanese National Clinical Database (NCD), which started its data registration in 2011, has grown into a large nationwide database covering more than 95% of the surgeries performed by regular surgeons in Japan [10]. As of the end of December 2019, 5276 facilities had been enrolled in the NCD, and approximately 1,500,000 cases have been registered every year [10]. Since enrolled cases are linked to a lifelong board certification system for surgeons, data registration to the NCD is mandatory for teaching hospitals and for community hospitals with surgical departments throughout Japan [11]. The NCD collects demographic data, procedural details, and perioperative variables that are almost identical to those of the American College of Surgeons’ National Surgical Quality Improvement Program [12]. In October 2018, the NCD-based prospective registry system for patients scheduled to undergo RG was launched under the leadership of the MHLW and JSES, and registration of each patient was mandated as a condition of insurance coverage for the surgery [5]. Therefore, the aim of the present study was to determine, by examining data from the NCD, whether RG under the national insurance program has been safely implemented nationwide.

Materials and methods

Data source

The NCD collects data on patient demographics, pre-existing comorbidities, preoperative laboratory values, and perioperative data, including the clinical course for up to 90 days after surgery [12]. Participating institutions can access all NCD variables, definitions, and inclusion criteria online [13]. An annual educational meeting for data managers and an e-learning system to achieve consistency in data entry are also provided. Data consistency is validated through inspections of randomly chosen institutions and assurance of data traceability using the web-based data management system [12]. Clinical staging was performed preoperatively according to the 8th edition of the Union for International Cancer Control-TNM classification [14].

Study design and cohort development

Consecutive patients with primary gastric cancer who underwent minimally invasive total or distal gastrectomy (robotic total gastrectomy, RTG; laparoscopic total gastrectomy, LTG; robotic distal gastrectomy, RDG; laparoscopic distal gastrectomy, LDG), performed by a surgeon qualified by the JSES ESSQS, between October 2018 and December 2019 were retrieved from the gastrointestinal surgery section of the NCD. Distal gastrectomy (DG) included pylorus-preserving gastrectomy and segmental gastrectomy. Patients who underwent proximal gastrectomy (PG) were not enrolled because detailed short-term outcomes of PG were not recorded in this database. The JSES provided the JSGS, which administers the gastroenterological section of the NCD, with a list of the medical license numbers of the surgeons qualified by the JSES ESSQS. The following exclusion criteria were set to focus on the surgical outcomes of gastrectomy with curative intent: (1) cStage IV (cT4b or cM1); (2) cTx, Nx, or Mx; (3) esophagogastric junction cancer with a length of esophageal invasion > 30 mm; (4) emergency surgery; (5) disseminated cancer; (6) concurrent surgical procedures except for cholecystectomy, splenectomy, enterostomy, local resection of the stomach, esophageal hiatal hernia repair, fundoplication, and central venous port placement; and (7) patients who declined publication of their treatment information or had insufficient follow-up.
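
The exclusion criteria listed above amount to a sequence of checks applied to each registered case. The following minimal sketch shows one way such a filter could be expressed; the `Case` record, its field names, and the string encodings of the cT/cN/cM categories are all hypothetical and do not reflect the actual NCD data schema.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Concurrent procedures that did NOT lead to exclusion (taken from criterion 6).
ALLOWED_CONCURRENT = {
    "cholecystectomy", "splenectomy", "enterostomy",
    "local resection of the stomach", "esophageal hiatal hernia repair",
    "fundoplication", "central venous port placement",
}


@dataclass
class Case:
    """Hypothetical record layout, for illustration only."""
    c_t: str                              # e.g. "T2", "T4b", "Tx"
    c_n: str                              # e.g. "N0", "Nx"
    c_m: str                              # e.g. "M0", "M1", "Mx"
    esophageal_invasion_mm: float = 0.0   # 0 if no esophageal invasion
    emergency: bool = False
    disseminated: bool = False
    concurrent_procedures: Set[str] = field(default_factory=set)
    declined_publication: bool = False
    insufficient_follow_up: bool = False


def exclusion_reasons(case: Case) -> List[str]:
    """Return the exclusion criteria met by a case (empty list = eligible)."""
    reasons = []
    if case.c_t == "T4b" or case.c_m == "M1":
        reasons.append("(1) cStage IV")
    if any(v.lower().endswith("x") for v in (case.c_t, case.c_n, case.c_m)):
        reasons.append("(2) cTx, Nx, or Mx")
    if case.esophageal_invasion_mm > 30:
        reasons.append("(3) esophageal invasion > 30 mm")
    if case.emergency:
        reasons.append("(4) emergency surgery")
    if case.disseminated:
        reasons.append("(5) disseminated cancer")
    if case.concurrent_procedures - ALLOWED_CONCURRENT:
        reasons.append("(6) non-permitted concurrent procedure")
    if case.declined_publication or case.insufficient_follow_up:
        reasons.append("(7) declined publication or insufficient follow-up")
    return reasons
```

A case is kept in the cohort only when `exclusion_reasons` returns an empty list; returning the reasons rather than a single flag makes it straightforward to tabulate the counts shown in a flow diagram.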

Selection of quality indicators and confounding factors

Consensus meetings were held by a study team consisting of surgeons and biostatisticians to determine quality indicators, adjust for confounding factors, and enable the comparison of surgical outcomes between RG and LG using propensity score matching (PSM). The primary outcome was the morbidity rate within 30 days of surgery, determined by C–D grade IIIa or higher [15]. We selected this outcome measure because postoperative complications requiring surgical, endoscopic, or radiological intervention, which correspond to C–D grade IIIa or higher, markedly extend the admission period, threaten the patient’s life, and increase medical cost [15,16,17]. The secondary outcomes were surgical outcomes, including open surgery conversion rate, incidence of intraoperative adverse events (cardiopulmonary arrest, myocardial infarction), operative time, estimated blood loss, massive intraoperative bleeding (≥ 1000 mL), intraoperative use of blood transfusion, intraoperative use of red blood cell transfusions, and surgical curativity (R0, R1, R2); incidence of each postoperative complication, including intra-abdominal infectious complications (anastomotic leakage, pancreatic fistula, intra-abdominal abscess), other local complications (pancreatitis, superficial incisional surgical site infection, deep incisional surgical site infection, wound dehiscence, intra-abdominal bleeding, anastomotic stenosis, functional and mechanical small bowel obstruction, anastomotic ulcer, gastroduodenal ulcer, and intestinal bleeding), and systemic complications (pneumonia, peritonitis not caused by intra-abdominal infectious complications, unexpected endotracheal intubation, pulmonary embolism, ventilator dependency, cardiac arrest requiring cardiopulmonary resuscitation, postoperative blood transfusion, deep venous thrombosis, sepsis, atelectasis, cardiac decompensation, disseminated intravascular coagulation, pleural empyema, tracheal necrosis, liver failure, refractory ascites, and liver abscess); duration of intensive care unit stay; duration of postoperative hospitalization; reoperation rate within 30 days after surgery; readmission rate within 30 days after surgery; 30-day mortality, defined as any death within 30 days after surgery; in-hospital mortality within 90 days after surgery; and surgical mortality, which included all patients who died within 30 days of operation or those who died during hospitalization within 90 days after surgery.

Preoperative factors that served as a basis for determining the allocation to either robotic or laparoscopic surgery were identified to estimate propensity scores. Several additional risk predictors identified in previous studies were also included in the model [11, 18,19,20,21,22]. As a result, covariates for propensity score estimation included patient age at the time of surgery, sex, body mass index, the American Society of Anesthesiologists physical status (ASA-PS), activities of daily living, smoking status, and habitual alcohol intake. We also included preoperative conditions: weight loss greater than 10% within the past 6 months; comorbidities such as any respiratory distress, mechanical ventilator dependency within 48 h of operation, insulin-dependent diabetes mellitus, chronic obstructive pulmonary disease, hypertension within 30 days of operation, congestive heart failure within 30 days of operation, angina within 30 days of operation, history of myocardial infarction within 6 months of surgery, previous cardiovascular surgery, need for preoperative dialysis within 14 days of operation, previous cerebrovascular accident, chronic steroid use, bleeding risk factors, and preoperative sepsis; and laboratory data (white blood cell count, hemoglobin, platelet count, albumin, alkaline phosphatase, creatinine, sodium, C-reactive protein, and prothrombin time). Factors such as cT and cN categories, type of resection (DG or total gastrectomy [TG]), esophagogastric junction cancer, presence of concurrent cholecystectomy, splenectomy, and enterostomy, use of preoperative chemotherapy, and hospital case volume were also considered. Hospital case volume was defined as the mean annual number of minimally invasive procedures (RDG, RTG, LDG, and LTG), estimated from the database. A hospital case volume of ≥ 20 cases/year was regarded as high-to-middle volume in this study.

Statistical analysis

A biostatistician (H. Y.) conducted propensity score modeling and matching while blinded to the outcomes. The propensity score was estimated using logistic regression models built separately in the cohort of DG cases and that of TG cases, predicting receipt of RG versus LG from the confounding variables described above. Greedy nearest neighbor matching was performed at a ratio of 1:1 without replacement, using a caliper of 0.2 standard deviations of the logit of the estimated propensity score, with the PSMATCH2 program [23]. The balance of the matched cohort was assessed by calculating the standardized difference (SD) between the two groups using the STDDIFF program [24]. An absolute SD above 0.1 indicated a meaningful imbalance. Outcomes were compared between the matched groups using McNemar’s test or the Stuart-Maxwell test for categorical variables and the Wilcoxon signed-rank test for continuous variables. A conditional logistic regression model was applied to estimate the odds ratio (OR) and 95% confidence interval (95% CI) of the primary outcome. Data are expressed as median (interquartile range) unless otherwise stated. All comparisons were two-sided, and a p value less than 0.05 was considered significant. All analyses were conducted using Stata 16 (StataCorp, College Station, TX, USA).
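
The matching and balance-checking steps described above can be sketched in a few lines of Python. This is an illustration of the general technique only, not a reproduction of the PSMATCH2 or STDDIFF programs: the function names are hypothetical, the caliper is assumed to use the standard deviation of the logit pooled over both groups, treated units are processed in input order, and the standardized difference shown is the common form for continuous covariates.

```python
import math
from statistics import mean, pstdev, variance
from typing import List, Sequence, Tuple


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def greedy_match(ps_treated: Sequence[float],
                 ps_control: Sequence[float],
                 caliper_sd: float = 0.2) -> List[Tuple[int, int]]:
    """1:1 greedy nearest-neighbor matching without replacement.

    Returns (treated_index, control_index) pairs whose logit propensity
    scores differ by no more than the caliper.
    """
    lt = [logit(p) for p in ps_treated]
    lc = [logit(p) for p in ps_control]
    # Assumption: caliper = 0.2 x SD of the logit over all subjects.
    caliper = caliper_sd * pstdev(lt + lc)

    available = set(range(len(lc)))
    pairs = []
    for i, x in enumerate(lt):  # assumption: input order of treated units
        if not available:
            break
        # Nearest still-unmatched control; ties broken by lowest index.
        j = min(available, key=lambda k: (abs(lc[k] - x), k))
        if abs(lc[j] - x) <= caliper:
            pairs.append((i, j))
            available.remove(j)  # without replacement
    return pairs


def standardized_difference(x_treated: Sequence[float],
                            x_control: Sequence[float]) -> float:
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(x_treated) + variance(x_control)) / 2.0)
    return (mean(x_treated) - mean(x_control)) / pooled_sd
```

After matching, `standardized_difference` would be evaluated for each covariate in the matched groups, and an absolute value above 0.1 would flag a meaningful imbalance under the criterion stated above. Binary covariates are usually handled with the analogous formula based on proportions, which is omitted here for brevity.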

Results

Patient demographics

A flow diagram of the patient selection process is shown in Fig. 1. During the study period, 10,722 patients who underwent RDG, LDG, RTG, or LTG were registered in the NCD. Of these, 841 patients were excluded, and the remaining 9881 patients, consisting of 2675 RGs and 7206 LGs, were analyzed. The background characteristics of the patients are summarized in Table 1. Patients treated by RG were younger, had better activities of daily living scores and less advanced disease, and were less often classified as ASA-PS 3–5. A greater proportion of patients treated with RG underwent DG (RG, 85.5% vs. LG, 79.8%), whereas a smaller proportion underwent concurrent surgical procedures (RG, 4.4% vs. LG, 11.8%). Moreover, RG was more likely to be performed in hospitals with a high-to-middle volume of minimally invasive gastrectomy (RG, 98.1% vs. LG, 79.5%). After propensity score matching, 2671 patients who underwent RG and 2671 who underwent LG were retained, and the SD of all these confounding factors was reduced to 0.07 or less (Table 1).

Fig. 1 Flow diagram of the patient selection process. DG distal gastrectomy, TG total gastrectomy, ESSQS Endoscopic Surgical Skill Qualification System, RG robotic gastrectomy, LG laparoscopic gastrectomy, RDG robotic distal gastrectomy, RTG robotic total gastrectomy, LDG laparoscopic distal gastrectomy, LTG laparoscopic total gastrectomy

Table 1 Patient background

Surgical outcomes

Surgical outcomes are shown in Table 2. The operative time was significantly longer in RG (RG, 354 [295–426] min vs. LG, 268 [221–326] min; p < 0.001), whereas no differences were seen between RG and LG in open surgery conversion rate, incidence of intraoperative adverse events, estimated blood loss, massive intraoperative bleeding, intraoperative use of blood transfusion, intraoperative use of red blood cell transfusions, and surgical curativity.

Table 2 Surgical outcomes

Short-term outcomes after surgery

The postoperative short-term outcomes are shown in Table 3. The morbidity rate (C–D grade ≥ IIIa), which was the primary outcome of this study, did not differ between RG and LG (RG, 4.9% vs. LG, 3.9%; OR, 1.27; 95% CI 0.977–1.650; p = 0.084). No differences were observed in intra-abdominal infectious complications (RG, 5.0% vs. LG, 5.4%; p = 0.533), other local complications (RG, 4.1% vs. LG, 4.0%; p = 0.944), and systemic complications (RG, 3.7% vs. LG, 3.4%; p = 0.602). The reoperation rate was greater in RG (RG, 2.2% vs. LG, 1.2%; p = 0.004); however, the duration of postoperative hospitalization was shorter in RG (RG, 10 [8–13] days vs. LG, 11 [9–14] days; p < 0.001). There was no difference in intensive care unit stay duration, readmission rate, 30-day mortality, in-hospital mortality, and surgical mortality in this series. In addition, the parameters for each type of resection are summarized in Table 4.
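
For readers less familiar with matched-pair analysis, the following minimal sketch shows how an odds ratio and McNemar's test relate to the discordant pairs in a 1:1 matched cohort. With a single binary exposure, the conditional maximum-likelihood odds ratio equals the ratio of the two discordant-pair counts. The function name is hypothetical, the confidence interval uses a Wald approximation on the log scale, and the counts in the usage example are invented for illustration; they are not the discordant-pair counts of this study, which are not reported in the text.

```python
import math


def matched_pair_summary(b: int, c: int, z: float = 1.96) -> dict:
    """Summarize a binary outcome in 1:1 matched pairs.

    b: pairs in which only the RG patient had the event
    c: pairs in which only the LG patient had the event
    (concordant pairs do not contribute to either statistic)
    """
    if b <= 0 or c <= 0:
        raise ValueError("both discordant counts must be positive")

    odds_ratio = b / c                      # conditional ML estimate
    se_log_or = math.sqrt(1 / b + 1 / c)    # Wald standard error on log scale
    ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)

    # McNemar's chi-square (1 degree of freedom, no continuity correction)
    chi2 = (b - c) ** 2 / (b + c)
    # Upper-tail probability of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))

    return {"or": odds_ratio, "ci": (ci_low, ci_high),
            "chi2": chi2, "p": p_value}


if __name__ == "__main__":
    # Hypothetical discordant-pair counts, for illustration only.
    print(matched_pair_summary(b=30, c=20))
```

A confidence interval that includes 1 and a p value above 0.05, as reported for the primary outcome, correspond to discordant pairs that are not clearly skewed toward either procedure.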

Table 3 Postoperative short-term outcomes
Table 4 Postoperative short-term outcomes on distal gastrectomy and total gastrectomy (after matching)

Discussion

The present large database study demonstrated that RG has been safely adopted in real-world practice under the Japanese universal health insurance program. In the NCD, detailed patient demographic data are recorded, including age, sex, tumor stage, comorbidities, clinical laboratory data, type of operation, and preoperative treatment; these were carefully balanced between the RG and LG groups using PSM in this study [11, 12, 18,19,20,21]. It has been reported that surgeon and hospital volume, as well as patient demographics, affect the outcomes of surgical treatment [20, 22]. Operating surgeons in both the LG and RG groups were confined to those qualified by the JSES ESSQS to control for surgeon volume; to be covered by insurance, RGs during this study period had to be performed by ESSQS-qualified surgeons, who are regarded as highly skilled in laparoscopic surgery and thoroughly familiar with operative anatomy and the LG procedure [7]. According to a previous study, ESSQS qualification is positively associated with LG case experience and proficiency in the LG procedure [25]. Moreover, experienced LG surgeons can overcome the learning curve of RG more rapidly [26,27,28]. Hospital volume was adjusted using PSM by categorizing annual minimally invasive gastrectomy (LG + RG) cases as ≥ 20 (high-to-middle volume) or < 20 (low volume), because for RGs to be covered by insurance, the operating hospital must have performed more than 20 minimally invasive gastrectomies during the past year [5]. Consequently, most of the enrolled patients underwent RG or LG in high-to-middle volume centers after PSM.

The primary goal of this study was to determine whether the advantageous short-term outcome of RG over LG (reduction in morbidity) achieved in our previous multi-institutional prospective study (UMIN000015388) [4] was reproduced in a real-world setting with good enumeration after the MHLW had recognized RG as a part of LG under the universal health insurance coverage. In this regard, the outcome of this study was almost reversed, although the difference was not statistically significant. However, the morbidity of C–D grade ≥ IIIa of 3.9% in LG is even better than those in previous studies (UMIN000015388 [4], 6.4%; Shibasaki et al. [29], 7.6%; Guerrini’s meta-analysis [30], 6.4%; Shibasaki’s systematic review [31], 1.1–17.5%; JCOG0703 trial [3] LDG for cStage I GC, Common Terminology Criteria for Adverse Events [CTCAE] v3.0 Grade ≥ 3, 5.1%; JCOG1401 trial [32] laparoscopic proximal or total gastrectomy for cStage I GC, CTCAE v4.0 Grade ≥ 3, 29.1%), suggesting that patients who underwent LG performed by ESSQS-qualified surgeons in high-to-middle volume centers were managed very well. In contrast, the morbidity of C–D grade ≥ IIIa of 4.9% in RG is somewhat higher than those previously reported (UMIN000015388 [4], 2.45%; Shibasaki et al. [29], 3.7%; Guerrini’s meta-analysis [30], 4.1%). This may be at least partly because the enrolled patients underwent RG within 1 year after RG was insured, and a considerable number of the operating surgeons may not have reached a learning plateau for RG, even though they were qualified by the ESSQS. Nevertheless, the morbidity of 4.9% in RG is still better than those of LG shown in the aforementioned previous studies [3, 4, 29,30,31,32], suggesting that the requirements for insured RG may contribute to the safe introduction of RG in general practice. The morbidity of RG is likely to become lower than that of LG as the skills for RG mature on a nationwide basis, since an increasing number of studies, including UMIN000015388, conducted mostly in the leading institutions for RG by expert surgeons in Japan, have revealed favorable outcomes for RG [2, 4, 29, 33,34,35,36]. We believe that methodologies are needed to raise fundamental skills, surgical concepts, and technical principles to levels comparable to those of expert RG surgeons. To this end, an educational system or adequate clinical experience could play a key role in bringing out the clinical benefits of RG in a real-world setting.

In this study, the 30-day mortality of LG (LG, 0.1%; LDG, 0.0%; LTG, 0.8%) was better than that reported in other large database studies (US National Cancer Database [37], 2.7%; Japanese NCD [11, 18, 19], 0.2–0.9%); however, it did not differ from that of RG (RG, 0.2%; RDG, 0.2%; RTG, 0.5%). A similar trend was observed in the estimated blood loss and readmission rate. Although RG had an increased reoperation rate, the rate of 2.2% in RG is considered acceptable, since the rates for each type of resection (LDG, 1.2%; RDG, 2.1%; LTG, 1.3%; RTG, 3.4%) were better than those of LG in the real world (LDG, 2.7% [19]; LTG, 4.3% [21]), which were previously determined using the NCD. These data also suggest that the requirements for insured RG may contribute to the safe introduction of RG in general practice. The duration of postoperative hospitalization was even shorter in RG, despite its higher reoperation rate and somewhat higher morbidity rate. This may be attributable not only to the potentially less invasive nature of RG compared with LG [34], but also to better control of postoperative adverse events after RG. The post hoc evaluation demonstrated that mechanical bowel obstruction was the most common cause of reoperation, especially after RG. This may be at least partly because of port-site hernia after RG, in association with the use of 8-mm or 12-mm trocars rather than 5-mm trocars [38]. Routine closure of the 8-mm, as well as 12-mm, port sites may help reduce emergent repair due to small bowel incarceration. There was a trend toward an increase in anastomotic leakage or drainage requiring reoperation in RG; however, the incidence of anastomotic leakage was comparable between RG and LG. Likewise, there was a trend toward an increase in red blood cell transfusion after RG; however, the reoperation rate due to postoperative hemorrhage was comparable between RG and LG. These trends were more pronounced in TG than in DG. Considering that the number of TGs was much smaller than that of DGs in this study and that TG is technically more demanding than DG [2, 11, 21, 32], the learning curve of RG may have contributed to the increase in severe anastomotic leakage as well as mild postoperative bleeding.

There are several limitations to the present study. First, this study was conducted retrospectively, and we were unable to discuss unmeasured outcomes. Second, long-term outcomes and cost-effectiveness could not be assessed, since prognosis, recurrence pattern, late complications, and cost were not documented in the NCD. Third, detailed PG data are also lacking in this database.

In conclusion, given that LG is a mature procedure whereas RG is still a developing one, RG was performed safely under the requirements for insurance coverage.