Background

The surgical management of colorectal cancer (CRC) has evolved in the last decade with the introduction of novel surgical equipment, techniques, and the rapid expansion of robotic surgery [1]. General surgery has become the largest market for robotics with a 24-fold increase since 2010 [2]. Proponents of this new technology allege improved outcomes and safety for common procedures, such as colorectal resections. However, evidence on the benefit of adopting the robotic platform for CRC remains limited and may not reflect real-world practice. CRC remains one of the most common types of cancer and a primary contributor to the increase in cancer-related death worldwide [3]. Despite a decrease in the incidence and mortality of CRC among adults older than 50 years of age, we are observing an alarming increase in CRC among younger adults since the early 1990s [4,5,6]. These trends highlight the importance of optimizing surgical treatment strategies for CRC.

The national operative case log database of the ACGME for general surgery residents showed an increase in the proportion of minimally invasive surgery in colorectal cases from 8% in 2003 to 43% in 2018 [7]. This increase was accompanied by evidence supporting laparoscopic colorectal surgery as superior to open surgery, with faster recovery, less postoperative pain, shorter length of hospital stay, and comparable oncologic outcomes [8,9,10]. Recently, robotic surgical systems were introduced to overcome certain limitations of laparoscopy by offering better 3D visualization, a stable camera, bimanual dexterity, tremor reduction, and improved ergonomics [11]. Therefore, robotic-assisted colorectal surgery has garnered wide acceptance despite the lack of convincing evidence on its advantages over laparoscopy [12,13,14,15,16,17,18]. Most studies addressing this comparison are based on single-institutional data, small sample sizes, or heterogeneous patient cohorts without appropriate control populations [19, 20].

To address this knowledge gap, we conducted a retrospective cohort study evaluating the perioperative outcomes of robotic and laparoscopic surgery for CRC in a propensity score-matched analysis. Using the colectomy-targeted American College of Surgeons-National Surgical Quality Improvement Program (ACS-NSQIP) database, we compared robotic and laparoscopic right colectomy (RC), left colectomy (LC), and low anterior resection (LAR). We hypothesized that, if robotic colorectal surgery offers an advantage over laparoscopy, perioperative outcomes would be more favorable after robotic-assisted surgery.

Methods

Data source

The ACS-NSQIP is a nationally validated, risk-adjusted, outcomes-based program used to track and refine surgical care based on 30-day patient outcomes. This program collects data on more than 250 variables, including demographics, preoperative risk factors, intraoperative variables, and 30-day postoperative morbidity and mortality. To ensure the highest quality standards, data are collected and maintained by a dedicated surgical clinical reviewer at each participating institution. The ACS-NSQIP also includes rigorous data field definitions with ongoing review, conducts frequent audits of participating sites, and requires annual certification exams for surgical clinical reviewers [21]. Using the unique “CASEID” variable, we merged the main NSQIP file with the colectomy-targeted participant use data file, which contains 23 additional variables specific to colorectal operations. This study was reviewed by the University of Texas Southwestern Human Research Protection Program and deemed exempt from IRB approval or oversight.
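The linkage step described above can be illustrated with a minimal Python sketch. It is not the authors' workflow: the record layout, the "AGE" field, and the "Robotic"/"Laparoscopic" labels are hypothetical placeholders, and only the "CASEID" and "COL_APPROACH" names come from the text. The sketch assumes one record per case in each file and keeps only cases present in both.

```python
def merge_on_caseid(main_rows, targeted_rows):
    """Inner-join two lists of dict records on the shared CASEID key."""
    # Index the colectomy-targeted records by case identifier.
    targeted_by_id = {row["CASEID"]: row for row in targeted_rows}
    merged = []
    for row in main_rows:
        extra = targeted_by_id.get(row["CASEID"])
        if extra is not None:
            # Combine the main-file fields with the targeted fields.
            merged.append({**row, **extra})
    return merged


# Hypothetical toy records; real files contain many more variables.
main_file = [
    {"CASEID": 1, "AGE": 64},
    {"CASEID": 2, "AGE": 51},
    {"CASEID": 3, "AGE": 70},
]
targeted_file = [
    {"CASEID": 1, "COL_APPROACH": "Robotic"},
    {"CASEID": 3, "COL_APPROACH": "Laparoscopic"},
]

merged = merge_on_caseid(main_file, targeted_file)
```

Case 2 has no colectomy-targeted record in this toy example, so it is dropped by the inner join.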

Study design and population

This is a retrospective cohort study using the ACS-NSQIP database from 2015 to 2020. Patients were identified using Current Procedural Terminology (CPT) codes for colorectal procedures. Elective robotic or laparoscopic resections with anastomosis for CRC were included. In an effort to homogenize the study population, we serially excluded cases with disseminated cancer, ascites, preoperative sepsis, ASA-5, ventilator dependence, and concurrent major procedures such as hepatectomy or pancreatectomy. Cases were stratified according to the location of the colon or rectal resection: right-sided colectomy (CPT codes 44160, 44205), left-sided colectomy (44140, 44204), or low anterior resection (44207, 44208, 44145, 44146). The data for each of these three groups are presented separately. Using the “COL_APPROACH” variable, patients were divided into robotic or laparoscopic groups. Patients who had an unplanned conversion to open surgery remained in their original group (intention-to-treat). Lastly, we performed a subgroup analysis of patients undergoing LAR, comparing those who underwent a diverting loop ileostomy (CPT 44208, 44146) with those who did not (CPT 44207, 44145). Figure 1 depicts the study flow diagram with inclusion and exclusion criteria. This study was reported in accordance with the “Strengthening the Reporting of Observational Studies in Epidemiology” (STROBE) 2021 guidelines [22].
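The CPT-based stratification can be written down as a small lookup. The following Python sketch uses only the codes listed above; the function name and the returned labels are illustrative choices, not part of the ACS-NSQIP data dictionary.

```python
# CPT codes taken from the study design; labels are illustrative.
CPT_TO_GROUP = {
    "44160": "RC", "44205": "RC",
    "44140": "LC", "44204": "LC",
    "44207": "LAR", "44208": "LAR", "44145": "LAR", "44146": "LAR",
}
# LAR codes that include a diverting loop ileostomy.
DIVERTED_LAR_CODES = {"44208", "44146"}


def classify_resection(cpt_code):
    """Return (group, diverted) for a CPT code.

    group is "RC", "LC", "LAR", or None for codes outside the study.
    diverted is True/False for LAR and None for every other group.
    """
    code = str(cpt_code)
    group = CPT_TO_GROUP.get(code)
    if group is None:
        return None, None
    if group == "LAR":
        return group, code in DIVERTED_LAR_CODES
    return group, None
```

For example, `classify_resection("44146")` yields `("LAR", True)`, which would place the case in the diverting loop ileostomy subgroup.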

Fig. 1 Study flow diagram showing inclusion and exclusion criteria

Outcomes evaluated

We compared baseline preoperative characteristics of patients undergoing robotic and laparoscopic colorectal resections such as age, gender, race, body mass index (BMI), ASA class, and bowel preparation (mechanical and antibiotic). We assessed intraoperative outcomes: number of lymph nodes harvested, unplanned conversion to open, and operative time. Finally, we evaluated postoperative outcomes: length of hospital stay (LOS), textbook outcome, anastomotic leak, postoperative ileus, 30-day readmissions, complications, and mortality.

Relying on a single outcome with low event rates may not accurately reflect the perioperative course, thereby creating a need for a multidimensional indicator to drive improvement in quality of care. Textbook outcome (TO) is a novel surgical quality assessment tool that combines structure, process, and surgical outcome. It is a simple, useful, and reliable measure that has been validated in different surgical specialties, showing adequate discriminant validity [23, 24]. This composite quality metric incorporates several parameters and many aspects of morbidity (complications, LOS, interventions, and readmission) to accurately reflect the perioperative course and most desirable outcome [25]. We defined TO as a length of hospital stay of less than 5 days (75th percentile) and the absence of 30-day complications, readmission, or mortality. In line with previous literature adapting the Clavien-Dindo classification to the ACS-NSQIP, major morbidity was defined as any of the complications listed in Appendix Table 6 [26, 27].
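The TO definition above reduces to a simple all-or-none rule. The Python sketch below encodes it under the stated criteria (LOS below 5 days and no 30-day complication, readmission, or death); the field names are hypothetical and do not correspond to ACS-NSQIP variable names.

```python
def is_textbook_outcome(los_days, any_complication, readmitted, died):
    """Apply the composite rule: short stay and no adverse 30-day event."""
    no_adverse_event = not (any_complication or readmitted or died)
    return los_days < 5 and no_adverse_event


def textbook_outcome_rate(cases):
    """Proportion of cases that meet every TO criterion."""
    if not cases:
        raise ValueError("at least one case is required")
    achieved = sum(is_textbook_outcome(**case) for case in cases)
    return achieved / len(cases)


# Hypothetical cases for illustration only.
example_cases = [
    {"los_days": 3, "any_complication": False, "readmitted": False, "died": False},
    {"los_days": 5, "any_complication": False, "readmitted": False, "died": False},
    {"los_days": 4, "any_complication": True, "readmitted": False, "died": False},
    {"los_days": 2, "any_complication": False, "readmitted": False, "died": False},
]
rate = textbook_outcome_rate(example_cases)
```

Because the rule is all-or-none, a single failed criterion (a 5-day stay in the second case, a complication in the third) is enough to lose the textbook outcome.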

Statistical analysis

Statistical analyses were performed using R statistical software and the IBM SPSS statistical package (Version 28). After defining two treatment groups as robotic and laparoscopic, we performed propensity score matching (PSM) using the “MatchIt” and “optmatch” packages in R. We estimated the conditional probability of undergoing a robotic colorectal resection (the propensity score) using a multivariable logistic regression model. Next, we created balanced cohorts using 2-to-1 (laparoscopic to robotic) optimal pair matching for RC and LC, and 1-to-1 for LAR due to the higher number of robotic LARs. The choice of covariates included in the PSM was done according to the recommendations provided by Kainz et al. [28]. We also included covariates that were statistically significant on multivariate analysis. The PSM was done without replacement and with a “tol” argument of 10⁻⁸ dictating the numerical tolerance that determines when the optimal solution is found. Using standardized mean differences (SMD), we conducted balance diagnostics with SMD < 0.1 indicating a good balance and implying a negligible difference between treatment groups. Continuous variables with normal distribution are presented as mean and standard deviation (SD), while those with non-normal distributions are presented as median and interquartile range [IQR]. In the unmatched cohorts, we compared the baseline demographic and pathologic characteristics between the two groups with a chi-squared test for categorical variables. In the matched cohorts, considering the paired nature of the data, we used a McNemar test or McNemar-Bowker test for categorical variables and a Wilcoxon signed-rank test for continuous variables. Two-sided p values are reported. A p value < 0.05 was considered statistically significant for all hypothesis testing.
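To illustrate the paired comparison of a binary outcome in matched data, the following Python sketch implements the standard large-sample McNemar test without continuity correction. It is a simplified illustration for 1-to-1 pairs and is not the authors' R or SPSS code; the function names and the example counts are hypothetical.

```python
import math


def count_discordant(pairs):
    """Count discordant pairs from (robotic_event, laparoscopic_event) tuples."""
    b = sum(1 for rob, lap in pairs if rob and not lap)  # event in robotic only
    c = sum(1 for rob, lap in pairs if lap and not rob)  # event in laparoscopic only
    return b, c


def mcnemar_test(b, c):
    """Return (chi-squared statistic, two-sided p value) for discordant counts."""
    if b + c == 0:
        # No discordant pairs: no evidence of a difference.
        return 0.0, 1.0
    statistic = (b - c) ** 2 / (b + c)
    # Survival function of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical matched pairs: 30 with the event only in the robotic case,
# 10 only in the laparoscopic case, and 60 concordant pairs.
pairs = [(True, False)] * 30 + [(False, True)] * 10 + [(False, False)] * 60
b, c = count_discordant(pairs)
statistic, p_value = mcnemar_test(b, c)
```

Only discordant pairs carry information in this test, which is why the 60 concordant pairs in the example do not affect the statistic.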

Results

Patient characteristics and propensity score matching

We identified 234,304 patients in the colectomy-targeted ACS-NSQIP (2015–2020). After screening for eligibility, 53,209 patients were included in the analysis: 16,982 had a RC, 19,201 LC, and 17,026 LAR. Figure 1 illustrates the distribution and matching results of patients stratified according to the location of their colorectal resection. Characteristics of patients included in the study cohort are described before and after matching in Tables 1, 2, and 3. For each of the three groups, the distribution of baseline covariates was adequately balanced in the matched data sets with the largest SMD = 0.048, implying a negligible discrepancy between treatment groups. Density plots of the matched data sets (Fig. 2) are nearly indistinguishable, implying a good balance of covariates based on the estimated propensity score. Figure 3 depicts the trends in the surgical approach to CRC in patients from the ACS-NSQIP during the study period (2015–2020).
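As a worked illustration of the balance diagnostic, the Python sketch below computes the SMD with the commonly used pooled-variance formulation. This is an assumption: the matching software may standardize differently (for example, by the standard deviation of the treated group), and the input values are hypothetical rather than taken from Tables 1 to 3.

```python
import math
import statistics


def smd_binary(p_treated, p_control):
    """SMD for a binary covariate given its prevalence in each group."""
    pooled_var = (p_treated * (1 - p_treated) + p_control * (1 - p_control)) / 2
    if pooled_var == 0:
        return 0.0
    return abs(p_treated - p_control) / math.sqrt(pooled_var)


def smd_continuous(treated, control):
    """SMD for a continuous covariate given raw values in each group."""
    pooled_var = (statistics.variance(treated) + statistics.variance(control)) / 2
    if pooled_var == 0:
        return 0.0
    mean_diff = statistics.fmean(treated) - statistics.fmean(control)
    return abs(mean_diff) / math.sqrt(pooled_var)


def is_balanced(smd, threshold=0.1):
    """Apply the SMD < 0.1 rule used for the balance diagnostics."""
    return smd < threshold


# Hypothetical prevalences of a covariate in two groups.
well_matched = smd_binary(0.30, 0.28)
poorly_matched = smd_binary(0.60, 0.40)
```

Under this formulation, a 2 percentage point difference around a 30% prevalence gives an SMD near 0.04 and passes the threshold, whereas a 20 point difference gives an SMD near 0.41 and does not.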

Table 1 Demographics and pathologic characteristics of patients undergoing right colectomy before and after propensity score matching
Table 2 Demographics and pathologic characteristics of patients undergoing left colectomy before and after propensity score matching
Table 3 Demographics and pathologic characteristics of patients undergoing low anterior resection before and after propensity score matching
Fig. 2 Density plots of propensity scores before and after optimal pair matching

Fig. 3 Trends in the surgical approach of colorectal cancer (CRC) in patients from the ACS-NSQIP (2015–2020)

Right and left colonic resections

Tables 1 and 2 illustrate the characteristics of patients undergoing RC and LC, respectively. Each robotic case was matched to two laparoscopic cases. Before matching, most variables had a statistically significant difference (p < 0.05) between the two groups. After PSM, all variables were homogeneously balanced (SMD < 0.1). Baseline demographics of the unmatched cohorts revealed that patients undergoing robotic RC and LC for CRC were more likely to be young, white, obese, and receive mechanical or antibiotic bowel prep (Tables 1 and 2). For all perioperative outcomes, we evaluated the 2:1 laparoscopic to robotic matched data sets (Table 4). Figure 4 illustrates the perioperative outcomes of robotic surgery for CRC compared to laparoscopy.

Table 4 Perioperative outcomes of right and left colectomy after propensity score matching
Fig. 4 Summary of perioperative outcomes of right colectomy, left colectomy, and low anterior resection after propensity score matching

All results are reported as robotic vs. laparoscopic unless otherwise specified. With regard to intraoperative outcomes, the median operative time was longer in robotic compared to laparoscopic resections (183 vs. 134 min for RC, 202 vs. 154 min for LC; p < 0.001). The average number of lymph nodes (LN) harvested during the operation, as documented in the pathology report, was higher in the robotic group (23.84 vs. 22.57 LN for RC, 21.70 vs. 21.03 for LC; p < 0.001). Robotic resection was associated with a lower conversion rate compared to laparoscopy (4.1% vs. 8.5% for RC, 5.2% vs. 8.8% for LC; p < 0.001). Finally, the rate of bleeding requiring transfusion within 72 h of operative start time was similar in the two groups (8.0% vs. 8.3% for RC; p = 0.57 and 5.7% vs. 6.1% for LC; p = 0.39).

When comparing postoperative outcomes, robotic and laparoscopic resections had comparable rates of anastomotic leak (1.9% vs. 1.8% for RC; p = 1 and 2.1% vs. 1.8% for LC; p = 0.301). The rate of postoperative ileus was significantly lower only in robotic RC (9.0% vs. 11.6%; p < 0.001), while it was comparable for both surgical approaches in LC (7.9% vs. 8.6%; p = 0.170). Both operative approaches had comparable overall complication rates (15.9% vs. 16.6% for RC; p = 0.43 and 12.7% vs. 13.9% for LC; p = 0.055), major morbidity (6.5% vs. 6.3% for RC; p = 0.64 and 5.5% vs. 5.7% for LC; p = 0.67), and 30-day mortality (0.7% vs. 1% for RC; p = 0.17 and 0.9% vs. 0.8% for LC; p = 0.38) (Table 4). Finally, robotic RC and LC were associated with a higher rate of textbook outcomes compared to laparoscopy (71.0% vs. 64.0% for RC and 74.6% vs. 68.1% for LC; p < 0.001). The significant difference in textbook outcomes was driven by the shorter LOS and the lower rate of any complications for both RC and LC. Although complication rates were not significantly lower in the robotic group on univariate analysis, they contributed to the higher rates of TO.

Low anterior resection

A total of 4854 patients undergoing robotic LAR were matched 1:1 to laparoscopic cases. Characteristics of the patient cohort undergoing LAR for CRC are illustrated in Table 3. Baseline demographics of the unmatched cohort revealed that patients undergoing robotic LAR were more likely to be young, white, obese, and receive mechanical or antibiotic bowel prep (Table 3). For all perioperative outcomes discussed below, we evaluated the 1:1 laparoscopic to robotic propensity score-matched data set (Table 5).

Table 5 Perioperative outcomes of low anterior resection after propensity score matching

When comparing intraoperative outcomes, the median operative time was longer in robotic LAR (246 vs. 201 min; p < 0.001). The average number of lymph nodes (LN) harvested was comparable in the two groups (20.14 vs. 20.28 LN; p = 0.744). Robotic resection was associated with a lower conversion rate (3.9% vs. 10.4%; p < 0.001). Finally, the rate of bleeding requiring transfusion was similar in the two groups (2.5% vs. 2.9%; p = 0.189).

When comparing postoperative outcomes, the robotic approach was associated with a higher rate of anastomotic leak compared to laparoscopy (3.4% vs. 2.4%; p = 0.005). Similarly, the rate of postoperative ileus was significantly higher in robotic LAR (11.9% vs. 10.5%; p = 0.032). Both operative approaches had comparable overall complication rates (12.2% vs. 11.9%; p = 0.754), but robotic LAR was associated with a higher rate of major morbidity (7.1% vs. 5.8%; p = 0.012). Finally, the two surgical approaches had comparable rates of textbook outcomes (68% vs. 67%; p = 0.297) and 30-day mortality (0.4% vs. 0.4%; p = 0.871; Table 5).

Similarly, a subgroup analysis comparing patients undergoing LAR without a diverting loop ileostomy showed a higher rate of anastomotic leaks, major morbidity, and readmission with the robotic approach (Appendix Table 7). However, when evaluating patients undergoing LAR with a diverting loop ileostomy, both the robotic and laparoscopic approaches had a comparable rate of perioperative morbidity (Appendix Table 8).

Discussion

In the USA, colorectal resections are among the most commonly performed surgical procedures and robotic surgery is being increasingly adopted in the management of CRC. Evidence supporting this transition from traditional laparoscopy has not been sufficient. To the best of our knowledge, this study is the largest retrospective propensity score-matched analysis comparing perioperative outcomes of robotic and laparoscopic resections for CRC. Our results suggest an advantage for the robotic approach in RC and LC, with a higher rate of textbook outcomes, a lower conversion rate, and comparable morbidity and mortality. Conversely, robotic LAR was associated with a similar rate of TO compared to laparoscopy and an increased rate of postoperative ileus, anastomotic leak, and major morbidity.

In recent years, an increasing number of studies have investigated the perioperative outcomes of minimally invasive surgery using the NSQIP database [29, 30]. El Aziz et al. reported a comparative study highlighting the increased adoption of robotic colorectal surgery and its implications on perioperative outcomes [29]. They compared open, laparoscopic, and robotic colectomies performed for any etiology, combining left, right, and low anterior resections. Similarly, a recent study by Soliman et al. compared the two approaches for CRC and chronic diverticulitis using the NSQIP database [31]. Although some of the endpoints examined in these two studies are identical to our outcomes, we believe that stratifying resections by their location and performing a propensity score-matched analysis provide a deeper understanding of the data and may uncover new insights that traditional statistical approaches cannot. When compared to rectal resections, RC and LC have fundamentally different technical and perioperative considerations; thus, it is essential to investigate each of these populations separately. Additionally, although the NSQIP provides colectomy data for various etiologies, our study compared the two surgical approaches in the management of CRC only.

A systematic review and meta-analysis by Tschann et al. showed more favorable perioperative outcomes with robotic RC compared to laparoscopy such as a lower rate of blood loss, lower conversion rate, and shorter LOS [32]. Another systematic review and meta-analysis by Solaini et al. concluded that robotic RC is non-inferior to laparoscopy in terms of postoperative complications and mortality [33]. Our study analogously demonstrates several advantages of robotic RC such as a higher rate of textbook outcomes, shorter LOS, lower conversion rate, and less postoperative ileus. With only one randomized controlled trial included, the main limitation of these two systematic reviews was that most included studies were retrospective, potentially contributing to a selection bias. Although our study is also retrospective, we performed a PSM analysis to mitigate the impact of selection bias.

In a recent systematic review and meta-analysis of patients undergoing LC, Solaini et al. concluded that the robotic approach is associated with lower postoperative complications and morbidity [15]. Interestingly, their results were not confirmed in the subgroup analysis done for malignant etiologies. Operative time was longer in the robotic group, while conversion rate was lower. This is in line with the findings of our study, which demonstrated comparable perioperative complication and mortality rates in LC for CRC. Our study provides a deeper understanding of this comparison and highlights the increased rate of textbook outcomes with robotic LC. We postulate that the robotic approach may be improving outcomes in RC and LC due to better 3D visualization, greater degrees of freedom, and the ability to precisely perform complex maneuvers in narrow anatomical spaces compared to laparoscopy.

Robotic LAR is a more complex and intricate procedure compared to RC and LC with an estimated learning curve of 55–65 cases, compared to 35–45 for LC, and 16–25 for RC [34, 35]. Using a national clinical database, Matsuyama et al. recently compared the perioperative outcomes of robotic and laparoscopic LAR in a propensity score-matched analysis in patients with rectal cancer [36]. They showed improved perioperative outcomes with robotic LAR such as a lower conversion rate, a shorter LOS, comparable complication rates, and a lower mortality rate. Furthermore, a meta-analysis by Sun et al. also showed a shorter LOS, lower conversion rate, and lower overall complication rate with robotic LAR compared to laparoscopy [37]. Although our study demonstrated a shorter LOS and a lower conversion rate after robotic LAR, it challenges some of the findings proposed by the two aforementioned studies. Our results showed a higher rate of severe complications and an increase in leak rates, postoperative ileus, and 30-day readmission after robotic LAR compared to laparoscopy. The higher leak rates with robotic LAR may be due to the long learning curve of this procedure or due to a selection bias for lower-lying rectal tumors being done robotically to use the advantages of this technology in the narrow pelvis. Contrary to our initial hypothesis, the results of this study did not substantiate our expected outcomes for low anterior resection. Instead, the data suggest a higher rate of postoperative morbidity with robotic LAR and no significant difference in the rate of textbook outcomes. This can be partially attributed to the fact that TO only considers overall (any) complications and not severe complications. Additionally, although the LOS demonstrated a statistically significant advantage in favor of the robotic approach, the actual difference was only 0.1 days, which may not have a clinically relevant impact.

In the ROLARR randomized controlled trial, Jayne et al. compared the conversion rate of robotic and laparoscopic rectal resection in 471 patients between 2011 and 2014 [38]. They reported a conversion rate of 8.1% for robotic and 12.2% for laparoscopic, but this difference was not statistically significant. Interestingly, a sensitivity analysis exploring learning effects suggested a potentially lower robotic conversion rate when performed by surgeons with substantial prior robotic experience. In our study, the conversion rate for laparoscopic LAR was 10.4% during the study period, which was significantly higher than the robotic conversion rate (3.9%). In a recent multicenter trial, Feng et al. reported better postoperative recovery with the robotic approach for middle and low rectal cancer (REAL trial) [39]. The strength of this trial lies in the selection of middle and low rectal cancer cases, which are theoretically the narrow anatomical spaces where robotic surgery is expected to confer an advantage over laparoscopy. In our current study, the LAR group included low, middle, and high rectal cancer because the ACS-NSQIP does not provide data on the distance of the tumor from the anal verge to stratify them.

Our study has several important limitations that need to be addressed. First, despite being one of the best available tools for quality improvement in surgery, the NSQIP database carries inherent constraints. Errors in classification, coding, or reporting of patient information may affect the quality of the data. Additionally, the NSQIP only collects data from around 850 or 14% of all US hospitals, further increasing the risk of selection bias towards more developed and higher performing centers. However, the large sample size generated from this national database allowed for a robust statistical analysis, which increases the accuracy of results, particularly when comparing procedures with small expected differences. Second, this was a retrospective cohort study, which carries a risk of selection bias. Even after implementing a PSM, residual selection bias from unmeasured/unknown confounders cannot be excluded in the absence of randomization. Third, the NSQIP does not provide data on neoadjuvant radiotherapy for CRC, an already established risk factor for postoperative complications. Fourth, there was no consideration of surgeon expertise level, and the nuanced variations of case complexity were not captured and accounted for by the included variables. The database lacks granular data allowing us to stratify participating institutions into high-volume/low-volume centers, and it lacks any information on the distance of rectal tumors from the anal verge, which contributes to the level of complexity of the case. Additionally, the expertise level of the surgeon performing the operation is unknown and their experience with either laparoscopic or robotic colectomy is not clearly defined. RC, LC, and LAR have different learning curves that must be evaluated in a multidimensional approach when comparing robotic and laparoscopic surgery [35].
Despite being a limitation of our study, the suspected heterogeneous expertise levels between different contributing centers reflect current real-world practice, thus enhancing the external validity and generalizability of this study. Additionally, the NSQIP does not report technical aspects of the procedure such as an intracorporeal vs. extracorporeal anastomosis, or the extent of lymphadenectomy (D2 vs. D3), which are known to affect OR time and other perioperative outcomes. It should also be acknowledged that differences in short-term outcomes such as LOS may be mediated by differences in postoperative care pathways. Finally, the short follow-up period reported by the ACS-NSQIP (30 days post-op) limits our ability to assess long-term oncologic and survival outcomes.

Conclusions

In this retrospective cohort study, robotic right and left colectomy for CRC showed an increase in textbook outcomes with a comparable morbidity and mortality compared to laparoscopy. Conversely, albeit limited by several possible confounders, low anterior resection showed increased rates of anastomotic leak, postoperative ileus, major morbidity, and a comparable rate of textbook outcomes. As robotic colorectal surgery comes with an increased fiscal burden, the enthusiasm accompanying it should not outpace the evidence needed to support its expansion. The associations highlighted in our study should be considered in the surgical planning for patients with colorectal cancer.