Introduction

Minimally invasive surgery (MIS), including laparoscopic radical hysterectomy (LRH) and robotic-assisted radical hysterectomy (RRH), had long been recognized as an alternative to abdominal radical hysterectomy (ARH), with reduced operative morbidity and similar oncological safety, until 2018 [1, 2]. The preliminary data that brought an early end to a randomized controlled trial (the Laparoscopic Approach to Cervical Cancer trial, LACC) and the results of a large-scale observational study revealed inferior overall survival (OS), disease-free survival (DFS) and progression-free survival (PFS) in patients undergoing MIS compared with ARH. These results shattered the long-standing consensus favoring MIS as primary treatment for cervical cancer, and the clinical practice guideline for cervical cancer changed accordingly [3].

Before LACC, most studies comparing MIS with ARH in cervical cancer reported that MIS showed better short-term outcomes and equivalent 5-year survival [4,5,6,7,8]; accordingly, three meta-analyses based on data from before LACC reported no difference in the risk of recurrence or death between patients who underwent MIS and those who underwent ARH [9, 10]. After LACC, evidence implying inferiority of MIS for managing cervical cancer emerged and accumulated [11,12,13,14]. Three recent meta-analyses evaluated the issue using different inclusion criteria and reached different conclusions: Tanitra et al. included five studies published before 2018 and suggested no difference in PFS or OS between MIS and ARH; Nitecki et al. identified 49 studies and included 15 high-quality studies in their meta-analysis comparing MIS and ARH in patients with FIGO IA1 to IIA (FIGO 2009 staging) cervical cancer, suggesting that MIS was associated with increased risks of both recurrence and death; Hwang et al. included 36 studies comparing DFS between patients undergoing LRH and ARH and suggested that LRH was associated with a higher risk of recurrence in patients with tumor size larger than 2 cm [15,16,17].

MIS has shown superior survival outcomes in prostate, colon and rectal cancers and equivalent outcomes in endometrial cancer according to the Gynecologic Oncology Group LAP2 trial and the Laparoscopic Approach to Cancer of the Endometrium (LACE) trial [18,19,20]. Researchers have proposed several hypothetical factors that might lead to the poor performance of MIS in cervical cancer, such as use of a uterine manipulator, CO2 pneumoperitoneum, the learning curve, hospital volume, surgeon technique and tumor size [21, 22]. Therefore, we aimed to evaluate the oncological safety of MIS in cervical cancer patients stratified by characteristics of disease (FIGO stage and tumor size), publication (publication time and journal), study design (single-center or multi-center) and treatment center (average sample size per center), and to identify possible factors that led to the controversies over MIS among previous studies.

Method

Literature search

The literature search was conducted in Medline, Embase, PubMed, the Cochrane Library and Web of Science from January 2000 to April 2021, without limitation on text availability, article type or language, using the following terms: “open”, “abdominal”, “laparotomy”, “laparoscopic”, “minimally invasive”, “robotic assisted”, “radical hysterectomy”, “surgery”, “cervical cancer”, “cervical carcinoma” and “carcinoma of the cervix”. An additional manual search was performed by scanning the references of all included and relevant studies.

Study selection and quality assessment

Two authors screened the titles and abstracts for potentially relevant articles, which were further reviewed for eligibility against the following inclusion criteria: (1) cervical cancer patients treated with minimally invasive or abdominal radical hysterectomy; (2) studies with at least two arms that compared OS, PFS or DFS; (3) patients who received surgery as primary treatment. Studies were excluded when (1) they were published as a comment, conference abstract or letter; (2) the total number of patients was less than 40 or at least one arm had fewer than 20 patients; (3) they did not provide sufficient data to estimate the hazard ratio (HR) and 95% confidence interval (CI) of OS, PFS or DFS between MIS and ARH; (4) patients received radical trachelectomy or laparoscopic-assisted radical vaginal hysterectomy; (5) patients received neoadjuvant radiotherapy or chemotherapy. When populations overlapped between studies, only the most recently published study with the larger population was included. Quality assessment was conducted using the Newcastle–Ottawa Scale (NOS) for assessing the quality of nonrandomized studies in meta-analyses (Supplemental file 1).

Data extraction and subgroup classification

Two authors extracted the following data: name of first author, year of publication, journal of publication, region of journal, country and region where the studies were conducted, data source, number of centers, time span of enrollment, surgical approach, study type, cohort matching status, technique level of the surgeon, FIGO stage, histology, tumor size, lymphatic metastasis, adjuvant therapy, sample size before and after propensity score matching, and HRs and 95% CIs of OS, PFS or DFS. HRs that were not reported were estimated according to the methods of Tierney et al. [23]. The extracted data were validated by a third author. Since all included studies uniformly followed the 2009 FIGO staging criteria, the FIGO classification used in this study retains that older nomenclature.
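As an illustration of the kind of conversion involved, the following minimal Python sketch recovers the log hazard ratio and its standard error from a reported HR and 95% CI, which is the form needed for pooling. It covers only this simplest case; Tierney et al. describe further routes (for example from p values and event counts) that are not shown here. The function name `log_hr_and_se` is hypothetical and not taken from the original analysis.

```python
import math

def log_hr_and_se(hr, ci_low, ci_high, z=1.959964):
    """Return (ln HR, SE of ln HR) from a reported HR and its 95% CI.

    Assumes the CI is symmetric on the log scale, so its width on that
    scale equals 2 * z * SE.
    """
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return log_hr, se

# Example using a pooled estimate quoted in the Results (HR 1.209; 95% CI 1.102-1.327).
log_hr, se = log_hr_and_se(1.209, 1.102, 1.327)
```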

The subgroup classification criteria were as follows: (1) FIGO stage and tumor size: studies reporting patients with FIGO stage IB1 cervical cancer were classified into the tumor size < 2 cm or ≥ 2 cm subgroup; (2) year of publication: studies were classified as published before or after the LACC trial; (3) region of journal: studies were classified according to the region of the journal in which they were published; (4) number of centers: studies were classified into the single-center or the multi-center group; (5) surgical approach: studies were classified into the LRH vs. ARH, the RRH vs. ARH or the MIS vs. ARH subgroup; (6) region of center: studies were classified into regional subgroups according to the continent where the surgeries were conducted; (7) sample volume: the annual number of radical hysterectomies performed by any approach per center in each study, estimated as the number of patients who received radical hysterectomy by any approach before propensity score matching, divided by the number of centers and then by the number of years of recruitment. Studies were classified into the high or low sample volume group using the median value as the cut-off; (8) MIS sample volume: the annual number of radical hysterectomies performed by MIS per center in each study, estimated as the number of patients who received radical hysterectomy by MIS before propensity score matching, divided by the number of centers and then by the number of years of recruitment. Studies were classified into the high or low MIS sample volume group using the median value as the cut-off.
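The sample volume definition in criteria (7) and (8) can be sketched in Python as follows. The function names and the numbers in the example are hypothetical. The text does not state how a study lying exactly on the median was assigned, so placing it in the high group here is an assumption.

```python
from statistics import median

def annual_volume_per_center(n_patients, n_centers, n_years):
    """Patients before matching, divided by centers, then by recruitment years."""
    return n_patients / n_centers / n_years

def classify_by_median(volumes):
    """Split studies into 'high' and 'low' groups at the median volume.

    Assumption: a study exactly at the median is labelled 'high'.
    """
    cut = median(volumes)
    return cut, ["high" if v >= cut else "low" for v in volumes]

# Hypothetical studies: (patients before matching, centers, years of recruitment).
studies = [(600, 3, 10), (120, 1, 8), (4000, 40, 5)]
volumes = [annual_volume_per_center(*s) for s in studies]
cut_off, groups = classify_by_median(volumes)
```

The same two functions apply to the MIS sample volume if the patient count is restricted to those treated by MIS.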

Statistical analysis

A random-effects model was used for all analyses regardless of heterogeneity [24]. Adjusted HRs and HRs after propensity score matching were used for pooled analysis when applicable. Sample size before propensity score matching was used as the weighting variable in the meta-analysis. Heterogeneity of the included studies was assessed by I² and p value according to Higgins et al. and was classified as small to modest (I² < 50%) or high (I² ≥ 50%) [25]. Publication bias was assessed by funnel plot and Egger’s test. A 95% CI of the HR not including 1 and a p value < 0.05 (two-sided) were considered statistically significant. All analyses were performed using Stata 14 (MP-Parallel Edition, College Station, TX 77845 USA).
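For readers unfamiliar with random-effects pooling, the following Python sketch shows one common estimator, the DerSimonian-Laird method, applied to log hazard ratios, together with the I² statistic. It is an illustration under stated assumptions rather than a reproduction of the analysis: it uses standard inverse-variance weights, which may differ from the weighting described above, and the function name and example inputs are hypothetical.

```python
import math

def pool_random_effects(log_hrs, ses, z=1.959964):
    """DerSimonian-Laird random-effects pooling of log hazard ratios."""
    w = [1.0 / s ** 2 for s in ses]                      # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_hrs)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hrs))  # Cochran's Q
    df = len(log_hrs) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0      # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_hrs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "hr": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se),
        "ci_high": math.exp(pooled + z * se),
        "tau2": tau2,
        "i2": i2,
    }

# Hypothetical inputs: three studies' ln(HR) values and their standard errors.
result = pool_random_effects([0.25, 0.10, 0.40], [0.12, 0.20, 0.15])
high_heterogeneity = result["i2"] >= 50.0  # threshold used in this study
```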

Results

Study characteristics

A total of 2770 citations were identified by electronic search (2671) and additional manual search (99) after removal of duplicates. Of these, 2541 were excluded by review of title and abstract. Full texts of the remaining 229 items were retrieved, of which 30 were reviews and comments, 60 compared the feasibility of different surgical plans, 40 focused on surgical complications, 32 did not provide sufficient data and 6 were studies with smaller cohorts that contained overlapping populations. Finally, 61 eligible studies with 63,369 patients (MIS 26956, ARH 36049) were identified (Fig. 1). Basic characteristics of the included studies are presented in Table 1. HRs and 95% CIs of DFS/PFS and OS were extracted and pooled from 58 studies (MIS 17092, ARH 14584) and 47 studies (MIS 17979, ARH 15493), respectively [1, 2, 4,5,6,7,8, 11,12,13,14, 18, 22, 26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73]. Comprehensive original data are shown in Supplemental file 2.

Fig. 1 Flow chart of study search and selection

Table 1 Basic characteristics of included studies

Meta-analyses of MIS versus ARH

Fifty-eight studies were included in the meta-analysis of DFS/PFS comparing MIS with ARH. The total number of patients before propensity score matching was 50,606 (MIS 20550 and ARH 29951), and 31,676 patients (MIS 17092 and ARH 14584) after matching were included in the meta-analysis. The overall analysis revealed that patients who received MIS had a higher risk of recurrence than patients who received ARH (HR 1.209; 95% CI 1.102–1.327). However, stratified analyses showed comparable PFS/DFS between the MIS and ARH patients in studies published before the LACC trial (HR 0.906; 95% CI 0.667–1.231), published in European journals (HR 0.919; 95% CI 0.717–1.178), conducted in a single center (HR 0.929; 95% CI 0.736–1.173), performed in centers in Europe (HR 1.027; 95% CI 0.789–1.338) or with high MIS sample volume (HR 1.028; 95% CI 0.857–1.234) (Table 2).

Table 2 Sub-group analyses of all studies comparing disease-free survival/progression-free survival between patients undergoing MIS and ARH

Forty-eight studies were included in the meta-analysis of OS comparing MIS with ARH. The total number of patients before propensity score matching was 59,212 (MIS 25347, ARH 33865), and 39,809 patients (MIS 21145 and ARH 18664) after matching were included in the meta-analysis. Patients who received MIS had a higher risk of death than patients who received ARH (HR 1.124; 95% CI 1.013–1.248). However, OS was comparable between patients who underwent MIS and ARH in studies published before the LACC trial (HR 0.857; 95% CI 0.628–1.169), published in European journals (HR 1.211; 95% CI 0.922–1.589), conducted in a single center (HR 1.021; 95% CI 0.772–1.352), performed in centers in Europe (HR 1.016; 95% CI 0.710–1.452) or Asia (HR 1.028; 95% CI 0.906–1.166) and with a high MIS sample volume (HR 1.016; 95% CI 0.843–1.225) (Table 3).

Table 3 Sub-group analyses of all studies comparing overall survival between patients undergoing MIS and ARH

Meta-analyses of LRH versus ARH and RRH versus ARH

Thirty-four studies were included in the meta-analysis of DFS/PFS comparing LRH with ARH. The total number of patients before propensity score matching was 26,886 (MIS 11270 and ARH 15252), and 17,778 patients (MIS 9284 and ARH 8494) after matching were included in the meta-analysis. Overall, patients who received LRH had a higher risk of recurrence than patients who received ARH (HR 1.277; 95% CI 1.143–1.426). However, stratified analyses showed comparable PFS/DFS between patients who underwent MIS and ARH in studies published before the LACC trial (HR 0.858; 95% CI 0.668–1.103), published in European (HR 0.858; 95% CI 0.640–1.151) and Asian (HR 1.105; 95% CI 0.856–1.426) journals, with a single-center study design (HR 0.996; 95% CI 0.768–1.291), conducted in centers in Europe (HR 1.226; 95% CI 0.914–1.643) or in centers with high MIS sample volume (HR 0.971; 95% CI 0.790–1.194) (Table 4).

Table 4 Sub-group analysis of studies comparing disease-free/progression-free survival between patients undergoing LRH and ARH

Twenty-five studies were included in the meta-analysis of OS comparing LRH with ARH. The total number of patients before propensity score matching was 30,063 (MIS 13245 and ARH 10633), and 21,945 (MIS 11312 and ARH 10633) after matching were included in the meta-analysis. Overall, there was no difference in OS between patients who underwent LRH and ARH. However, LRH was associated with poorer OS in studies published in American journals (HR 1.258; 95% CI 1.023–1.547) and conducted in centers with a low MIS sample volume (HR 1.249; 95% CI 1.007–1.550), but with better OS in studies published in Asian journals (HR 0.718; 95% CI 0.587–0.877) (Table 5).

Table 5 Sub-group analysis of studies comparing overall survival between patients undergoing LRH and ARH

Thirteen studies were included in the meta-analysis of PFS/DFS comparing RRH with ARH. The total number of patients before propensity score matching was 14,044 (RRH 3103 and ARH 10941), and 5084 (MIS 2534 and ARH 2550) after matching were included in the meta-analysis. Patients who received RRH had a higher risk of recurrence than patients who received ARH (HR 1.303; 95% CI 1.130–1.503). However, PFS/DFS was comparable between patients who underwent MIS and ARH in studies conducted in a single center (HR 1.516; 95% CI 0.970–2.369) or in Europe (HR 1.376; 95% CI 0.940–2.014) (Table 6).

Table 6 Sub-group analysis of studies comparing disease-free/progression-free survival between patients undergoing RRH and ARH

Eleven studies with 4470 patients (RRH 2217, ARH 2253) were included to evaluate OS in patients who received RRH. There was no difference in OS between RRH and ARH.

Meta-analyses of MIS versus ARH in patients with FIGO IB1 cervical cancer

Seventeen studies with 13,944 patients after matching (MIS 7168, ARH 6776) specifically compared MIS and ARH in patients with FIGO IB1 cervical cancer. Overall, MIS was associated with an increased risk of recurrence and progression in patients with FIGO IB1 cervical cancer (HR 1.515; 95% CI 1.271–1.805). Subgroup analyses showed comparable PFS/DFS between patients who underwent MIS and ARH in studies with a single-center design (HR 1.558; 95% CI 0.911–2.664), conducted in Europe (HR 1.241; 95% CI 0.891–1.728) and with high sample volume (HR 1.254; 95% CI 0.862–1.824) or high MIS sample volume (HR 1.264; 95% CI 0.876–1.824) (Table 7). Moreover, MIS was associated with an increased risk of recurrence in patients with tumor size ≥ 2 cm (HR 1.787; 95% CI 1.396–2.286) but not in those with tumor size < 2 cm (HR 1.257; 95% CI 0.884–1.789). In the 8375 FIGO IB1 patients with tumor size < 2 cm (MIS 4333, ARH 4042), although the overall meta-analysis showed comparable PFS/DFS between patients who underwent MIS and ARH, MIS was associated with an increased risk of recurrence in studies conducted in Asia (HR 1.398; 95% CI 1.061–1.843) and in studies with low sample volume (HR 1.552; 95% CI 1.190–2.024) or low MIS sample volume (HR 1.527; 95% CI 1.183–1.969) (Table 8).

Table 7 Sub-group analysis of studies comparing disease-free survival/progression-free survival between FIGO IB1 patients undergoing MIS and ARH
Table 8 Sub-group analysis of studies comparing disease-free survival/progression-free survival between FIGO IB1 patients with tumor size < 2 cm undergoing MIS and ARH

Publication bias

Publication bias was first evaluated by visual inspection of funnel plots and then by Egger’s test. Visual inspection of the funnel plots for studies comparing DFS/PFS and OS of patients undergoing MIS and ARH showed slight asymmetry (Figs. 2 and 3). The results of Egger’s tests did not indicate statistically significant publication bias for pooled DFS/PFS (p = 0.074) or OS (p = 0.052). Sensitivity analysis was performed by sequentially removing and re-adding each included study; the results remained unchanged.
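The regression behind Egger’s test can be sketched in Python as follows: each study’s standardized effect (ln HR divided by its standard error) is regressed on its precision (1 divided by the standard error), and an intercept far from zero suggests funnel plot asymmetry. This is a minimal illustration with a hypothetical function name and hypothetical inputs, not the Stata routine used in the analysis. It returns the t statistic for the intercept; the p value follows from a t distribution with n − 2 degrees of freedom and is not computed here.

```python
import math

def egger_regression(log_hrs, ses):
    """Ordinary least squares of (ln HR / SE) on (1 / SE).

    Returns (intercept, slope, t statistic of the intercept).
    """
    n = len(log_hrs)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    x = [1.0 / s for s in ses]                     # precision
    y = [b / s for b, s in zip(log_hrs, ses)]      # standardized effect
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)                             # residual variance
    se_int = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, slope, t_stat

# Hypothetical inputs: ln(HR) values and standard errors of five studies.
intercept, slope, t_stat = egger_regression(
    [0.30, 0.15, 0.45, 0.05, 0.25], [0.10, 0.15, 0.30, 0.08, 0.20]
)
```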

Fig. 2 Funnel plots for studies comparing disease-free survival/progression-free survival

Fig. 3 Funnel plots for studies comparing overall survival

Discussion

Overall, compared with ARH, MIS was associated with an increased risk of disease progression or recurrence and an increased risk of death in women with early-stage cervical cancer. Comparable oncological outcomes between patients who received MIS and ARH were found in the meta-analysis of FIGO IB1 patients with tumor size less than 2 cm and in studies published before the LACC trial, published in European journals, conducted in a single center, performed in centers in Europe or with a high MIS sample volume, while inferiority of MIS was found in the meta-analysis of studies published after the LACC trial, with a multi-center study design, conducted in Asia and America, or in centers with a low MIS sample volume. These findings delineate the complexity of the factors affecting MIS outcomes reported in published studies and may prompt a rethinking of the surgical approaches for radical hysterectomy in early-stage cervical cancer.

We found comparable oncological safety in patients undergoing MIS compared with ARH in studies published before the LACC trial but inferiority of MIS in studies published after the LACC trial, which was consistent with previous meta-analyses [9, 10, 15,16,17]. Additionally, studies published in American journals showed poorer PFS in patients who received MIS, while studies published in European and Asian journals showed comparable PFS between patients undergoing MIS and ARH. We presume that publication characteristics such as year and journal of publication might be one reason for the heterogeneous results in this study. The comparison between MIS and ARH also revealed marked geographical differences. Specifically, studies in Asia reported that both LRH and RRH were associated with an increased risk of recurrence and progression in patients with cervical cancer, which was consistent with the results reported by Hwang et al. [17]. Studies in America reported poorer PFS in patients who received MIS, except those with FIGO IB1 disease, while studies in Europe reported comparable DFS/PFS between MIS and ARH. Regional subgroup analyses revealed high consistency in Europe but marked heterogeneity in Asia and America. Based on the present study, we did not find any evidence against MIS as an alternative to ARH in Europe. However, the results of our study, like those of most previous meta-analyses, were based primarily on non-randomized studies and should therefore be interpreted as hypothesis-generating.

Comparable oncological safety of MIS vs. ARH and of LRH vs. ARH was observed in centers with a high sample volume or high MIS sample volume, whereas poorer outcomes of MIS and LRH were observed in centers with a low sample volume or low MIS sample volume. A retrospective analysis involving 116 Japanese centers, where 5964 women with FIGO IB1-IIB cervical cancer underwent radical hysterectomy, revealed a significantly decreased risk of recurrence (HR 0.69; 95% CI 0.57–0.84) and death (HR 0.75; 95% CI 0.59–0.95) in high-volume centers compared with low-volume centers [74]. According to data reported by Matsuo et al. in a population-based retrospective study that queried the American National Inpatient Sample from 2007 to 2011, the centers favoring RRH were more likely to be small bed-capacity hospitals and less likely to be urban teaching hospitals [75]. In this case, higher RRH volume did not reflect the proficiency of treatment centers; if anything, it indicated the opposite. These findings imply that whether MIS is comparable to ARH is center-associated, which is consistent with the finding by Gennari et al. that the treatment center remained a strong prognostic factor for recurrence-free survival (RFS) (high-volume vs. low-volume HR 0.49; 95% CI 0.28–0.83) and OS (high-volume vs. low-volume HR 0.50; 95% CI 0.26–0.94) [29].

MIS was associated with an increased risk of recurrence and progression in studies with a multi-center design but not in studies with a single-center design. This difference might be partially due to differences between the centers involved in multi-center and single-center studies: ideally, in a single-center study, the center should be capable of providing a sufficient number of MIS cases, while in a multi-center study, the number of centers with low sample volume should be limited. However, in several intercontinental, nationwide and large regional multi-center studies, the number of included centers was disproportionate to the number of patients, and centers with different sample volumes, bed capacities and levels of MIS technique were mixed. Additionally, some local high-volume centers were not included in multi-center studies from the same region. For example, centers from the studies reported by He et al. [60] and Yang et al. [5] were absent from the study reported by Chen et al. [40]. The absence of local high-volume centers from multi-center studies could further widen the difference between single-center and multi-center studies and lead to results favoring ARH. Compared with the assessment of medical therapies, the assessment of surgical approaches is even more difficult because of heterogeneity in personal skills, surgical instruments, experience of the surgical team, supportive management of complications and adjuvant therapy, let alone for a surgery as challenging as minimally invasive radical hysterectomy. Even RCTs, the most rigorous study design, are difficult to conduct rigorously when evaluating some complex surgical interventions [76]. Therefore, the reported increased risk of recurrence of cervical cancer associated with MIS might reflect uneven proficiency in the MIS technique around the world rather than inferiority of the surgical approach itself [77, 78]. Center-associated factors such as center sample volume and surgeon experience need to be taken into consideration in future evaluations of MIS hysterectomy.

Study inclusion was maximized and subgroup analyses based on characteristics of disease, publication, study design and treatment center were performed to obtain a general picture of the actual oncological safety of MIS for cervical cancer amid previously heterogeneous results. However, this study design has several limitations. A few earlier studies did not report the adjustment method used to control for variables, and the cohorts of these earlier studies were also relatively small compared with those of recent multi-center studies. The sample volume and MIS sample volume for multi-center studies might not represent the actual surgical volume of each included center, since most multi-center studies did not report the exact number of included patients from each center. We did not evaluate the potential impact of protective maneuvers on improving the oncological safety of the laparoscopic radical hysterectomy technique, as suggested by Kampers et al. [79]. We were also unable to perform stratified analyses based on the proficiency of the treatment center and the MIS technique of the surgeons.

Conclusions

Our findings highlight possible factors that contributed to the inferior performance of MIS in cervical cancer, including publication characteristics, center geography and sample volume. Center-associated factors need to be taken into consideration when evaluating complex surgical procedures such as radical hysterectomy.