Introduction

Scoliosis is a three-dimensional deformity marked by coronal and sagittal curvature of the spine with varying degrees of spinal rotation [1]. In severe or progressive cases, surgical correction with spinal fusion is necessary [2]. One of the key components of surgical correction is the placement of pedicle screws, which allows for 3-column fixation and deformity correction manoeuvres. Scoliosis is associated with three-dimensional anatomical complexity, including vertebral rotation and small dysplastic pedicles in the curve concavity [3]. This deformity makes pedicle screw placement technically challenging, which elevates the risk of screw misplacement and of complications such as neurological injury, visceral injury, or revision surgery [4].

At present, the primary approach for pedicle screw implantation is the conventional free-hand technique (CF) [5]. Despite careful pedicle tapping for accurate determination of screw pathway, pedicle screws may be inaccurately placed due to atypical anatomical complexity including axial rotation as well as pedicle calibre [1]. Screw misplacement continues to be the predominant form of instrument-related complication in scoliosis surgery, posing a significant concern for both patients and spinal deformity surgeons [5].

Various techniques have been developed to assist accurate pedicle screw insertion, including spinal navigation systems (NS) and robot-assisted (RA) technologies [6]. Navigation involves a computerized image-processing and visualization system that provides crucial intraoperative assistance for screw placement. This commonly takes the form of 3D fluoroscopy or intraoperative computed tomography (CT) to monitor the patient’s anatomical position, along with infrared stereoscopic positioning technology to track the location of the surgical instruments, ensuring high precision [7].

In recent years, there has been extensive interest and research relating to the role of robot-assisted (RA) technology in spine surgery [8]. The goal of RA surgery is to address manual surgeon errors, commonly seen in more conventional techniques, and to allow better surgical planning [9]. An RA system primarily consists of a robotic arm, an optical tracking system, and a surgical navigation system; together these components work to establish a clear surgical plan with precise pedicle screw trajectories and real-time monitoring. The challenge with this system is the absence of tactile feedback during screw placement [10].

The objective of this study is to evaluate and contrast the safety and efficacy of distinct pedicle screw insertion methods, including CF techniques, NS, and RA surgery. More specifically, RA surgery will be compared to both NS and CF techniques. To the best of our knowledge, this is the first systematic review and meta-analysis to consider these two comparisons together, specifically within the scoliosis population.

Methods

A systematic review and meta-analysis were conducted as per the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [11].

Eligibility criteria

The aim was to assess and compare RA surgery to NS and CF surgery. All observational studies directly comparing RA to either one of these groups were included. Scoliosis deformity was defined as a main thoracic coronal Cobb angle greater than 10 degrees.

Primary outcomes

The primary outcome was acceptable pedicle screw placement as per the Gertzbein-Robbins grading system [12]. This system classifies pedicle screw position into 5 grades (A-E) based on postoperative CT. A grade A screw has no breach of the pedicle cortex. A grade B screw has a breach < 2 mm. A grade C screw has a breach ≥ 2 but < 4 mm. A grade D screw has a breach ≥ 4 but < 6 mm. A grade E screw has a breach ≥ 6 mm. Pedicle screws classified as either A or B are considered clinically acceptable.
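The grading rule above can be expressed as a small lookup. The following is a minimal sketch rather than the procedure used in the included studies: the function names are hypothetical, the breach distance is assumed to be a single measurement in millimetres taken from postoperative CT, and grade D is assumed to stop below 6 mm so that the grades do not overlap.

```python
def gertzbein_robbins_grade(breach_mm: float) -> str:
    """Map a cortical breach distance (mm) to a Gertzbein-Robbins grade."""
    if breach_mm < 0:
        raise ValueError("breach distance cannot be negative")
    if breach_mm == 0:
        return "A"  # screw fully contained, no cortical breach
    if breach_mm < 2:
        return "B"
    if breach_mm < 4:
        return "C"
    if breach_mm < 6:
        return "D"  # assumption: D is capped below 6 mm
    return "E"


def is_clinically_acceptable(breach_mm: float) -> bool:
    """Grades A and B are treated as clinically acceptable."""
    return gertzbein_robbins_grade(breach_mm) in {"A", "B"}


def acceptable_rate(breaches_mm) -> float:
    """Percentage of screws graded A or B in a list of breach distances."""
    breaches = list(breaches_mm)
    if not breaches:
        raise ValueError("at least one screw is required")
    accepted = sum(is_clinically_acceptable(b) for b in breaches)
    return 100.0 * accepted / len(breaches)
```

Applied to a list of breach measurements, `acceptable_rate` gives the "% A + B accuracy" figure that is pooled in the primary outcome.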

Secondary outcomes

Secondary outcomes included radiation exposure in millisieverts (mSv), operation duration in minutes (mins), estimated blood loss (EBL) in millilitres (mL), length of hospital stay (LOS), deformity correction rate (percentage change in Cobb angle), total Scoliosis Research Society (SRS) score, postoperative Visual Analogue Scale (VAS) score for pain and postoperative Japanese Orthopaedic Association (JOA) score.

Literature search strategy

A search of the following electronic databases was performed by two authors independently: MEDLINE, EMBASE, and the Cochrane Central Register of Controlled Trials (CENTRAL). The last search was run on the 14th of November 2023. In addition, the World Health Organization International Clinical Trials Registry (http://apps.who.int/trialsearch/), ClinicalTrials.gov (http://clinical-trials.gov/), and the ISRCTN Register (http://www.isrctn.com/) were searched for any ongoing or unpublished studies. No language restrictions were applied in our search strategies. The search terms included ‘robot’, ‘deformity’, ‘scoliosis’, ‘navigation’, ‘CT’, ‘freehand’, ‘fluoroscopy’, ‘O-arm’, ‘C-arm’.

Selection of studies

Two authors independently assessed the titles and abstracts of the identified studies. Full texts of relevant studies were obtained, and those that met our eligibility criteria were chosen. Any discrepancies in study selection were resolved via group discussion between the authors.

Data extraction and management

As per Cochrane’s data collection form for intervention reviews, a spreadsheet was pilot-tested in randomly selected articles and was adjusted accordingly. The data extraction spreadsheet included study-related data (first author, year of publication, country of origin of the corresponding author, journal in which the study was published, study design, study size, clinical condition of the study participants, type of intervention, and comparison), baseline demographics of the included populations (age and gender) and primary and secondary outcome data. Two authors collected and recorded the findings and any disagreements were resolved through discussion.

Data analysis

Review Manager 5.3 software was used for data analysis. The collected data were entered into the software by two independent authors. A fixed effects model was used for outcomes with heterogeneity (I2) of less than 50%. A random effects model was used for outcomes with heterogeneity over 50%. 95% Confidence Intervals (CIs) were used in the forest plots. For dichotomous outcomes, the Odds Ratio (OR) was calculated between the two groups, whereas for continuous outcomes, the Mean Difference (MD) was used.
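To illustrate how a pooled estimate and its 95% CI arise under each model, the sketch below pools study-level effects by inverse-variance weighting, with a DerSimonian-Laird between-study variance for the random effects case. This is an assumption made for illustration only: the weighting method selected in Review Manager is not stated here, and the function names and any counts passed to them are hypothetical.

```python
import math


def log_odds_ratio(events_a, total_a, events_b, total_b):
    """Return (log OR, variance) for one study from its 2x2 table."""
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    if 0 in (a, b, c, d):
        # Continuity correction for empty cells (an assumption of this sketch).
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool(effects, variances, random_effects=False):
    """Inverse-variance pooled estimate with a 95% confidence interval."""
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    if random_effects:
        # DerSimonian-Laird estimate of the between-study variance tau^2.
        q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
        df = len(effects) - 1
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
        weights = [1 / (v + tau2) for v in variances]
    estimate = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return estimate, estimate - 1.96 * se, estimate + 1.96 * se
```

For a dichotomous outcome, each study's `log_odds_ratio` output is passed to `pool` and the three returned values are exponentiated to give the OR and its CI. For a continuous outcome, the study-level mean differences and their variances are passed to `pool` directly.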

Assessment of heterogeneity

Study heterogeneity was assessed using the Cochran Q test (χ2) and was quantified by calculating I2. It was interpreted as follows: 0 to 25% as low heterogeneity, 25 to 75% as moderate heterogeneity, and 75 to 100% as high heterogeneity.
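As a worked illustration of these quantities, the sketch below computes Cochran's Q and I2 from study-level effects and variances and then applies the interpretation bands and the 50% model threshold described in the Methods. It is a sketch under stated assumptions: the function names are hypothetical, and because the text does not say which band the boundary values belong to, exactly 25% and 75% are assigned to the higher band and exactly 50% to the fixed effects model.

```python
def cochran_q_and_i2(effects, variances):
    """Return Cochran's Q and I2 (%) using inverse-variance weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of total variation beyond chance, floored at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


def heterogeneity_band(i2):
    """Label I2 using the bands stated in the text (boundary handling assumed)."""
    if i2 < 25:
        return "low"
    if i2 < 75:
        return "moderate"
    return "high"


def choose_model(i2):
    """Random effects above 50% heterogeneity, fixed effects otherwise."""
    return "random" if i2 > 50 else "fixed"
```

Under this rule an I2 of 49% would be labelled moderate and analysed with a fixed effects model, while an I2 of 87% would be labelled high and analysed with a random effects model.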

Quality assessment

Using the Cochrane collaboration tool, risk of bias was assessed in randomized studies. The quality of all non-randomized studies was assessed via the Newcastle Ottawa Scale; this involves a star system to analyse study selection, comparability and outcome [13].

Sensitivity analysis

A sensitivity analysis was carried out to examine the influence of individual studies on the result of each forest plot and to detect skewing of the results by any one study. One study at a time was excluded from the analysis to assess its impact on overall significance. Neither individual studies nor those with a high risk of bias independently altered the significance of the results. This was supported by funnel plot analysis.
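The leave-one-out procedure described above can be sketched as follows. This is an illustrative sketch rather than the analysis as run: the function names are hypothetical, a fixed effects inverse-variance pool is assumed for brevity, and a result is treated as significant when its 95% CI excludes zero (for odds ratios the effects would be supplied on the log scale).

```python
import math


def pooled_fixed(effects, variances):
    """Fixed effects inverse-variance estimate with a 95% CI."""
    weights = [1 / v for v in variances]
    estimate = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return estimate, estimate - 1.96 * se, estimate + 1.96 * se


def leave_one_out(effects, variances):
    """Re-pool the data once per study, omitting that study each time."""
    effects, variances = list(effects), list(variances)
    if len(effects) < 2:
        raise ValueError("at least two studies are required")
    results = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        estimate, low, high = pooled_fixed(rest_effects, rest_variances)
        significant = low > 0 or high < 0  # CI excludes the null value
        results.append((i, estimate, low, high, significant))
    return results
```

If the `significant` flag is the same in every row as in the full analysis, no single study drives the conclusion.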

Results

Literature search results

The literature search identified 291 studies and after a thorough screening of the retrieved articles a total of 10 studies met the eligibility criteria (Fig. 1).

Fig. 1

PRISMA flow diagram representing the search and selection processes applied during the review. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses [11]

Baseline characteristics

The baseline demographic data of included studies can be seen in Table 1.

Table 1 Baseline characteristics of included studies. RA vs NS vs CF

Primary outcome for RA vs. NS

Screw placement accuracy (% A + B accuracy)

In Fig. 2, screw placement accuracy is compared between the RA and NS groups. From a total of 4 studies, a total of 5556 screws were placed. There were greater odds of placing pedicle screws in a clinically acceptable position in the RA group relative to the NS group (OR = 2.02, CI = 1.52–2.67, P < 0.00001). The level of heterogeneity was moderate (I2 = 49%, P = 0.12).

Fig. 2

Forest plot for screw placement accuracy in RA vs NS

Secondary outcomes for RA vs. NS

Operation duration (minutes)

In Fig. 3, operation duration is compared between the RA and NS groups. From a total of 4 studies, 368 patients were enrolled. Operation durations were significantly greater in RA (MD = 10.74, CI = 3.52–17.97, P = 0.004). The level of heterogeneity was low (I2 = 0%, P = 0.59).

Fig. 3

Forest plot of operation duration in RA vs NS

Radiation exposure (mSv)

Figure 4 reports radiation exposure in 3 studies with a total of 318 patients. No statistically significant difference in radiation exposure (mSv) was seen between the two groups (MD = –2.5, CI = –7.66–2.66, P = 0.34). Heterogeneity for this analysis was high (I2 = 100%, P < 0.00001).

Fig. 4

Forest plot of radiation exposure (mSv) in RA vs NS

Estimated blood loss (milliliters)

Figure 5 reports EBL in 3 studies with a total of 318 patients with no statistically significant difference between the groups (MD = 4.02, CI = -41.49–49.53, P = 0.86). A low level of heterogeneity was present (I2 = 0%, P = 0.63).

Fig. 5

Forest plot of estimated blood loss (mL) in RA vs NS

Length of hospital stay (LOS) in days

In Fig. 6, LOS was reported in 3 studies enrolling a total of 318 patients with no statistically significant difference between the two groups (MD = –0.26, CI = –0.78–0.25, P = 0.32). A low level of heterogeneity was present (I2 = 11%, P = 0.32).

Fig. 6

Forest plot for length of hospital stay postoperatively

Primary outcome for RA vs. CF

Screw placement accuracy (% A + B accuracy)

As seen in Fig. 7, screw placement accuracy was reported in 6 studies with a total of 7164 screws inserted. The RA group had significantly greater odds of placing screws with greater accuracy and in a clinically acceptable position (OR = 3.06, CI = 1.79–5.23, P < 0.00001). Study heterogeneity was high (I2 = 87%, P < 0.00001).

Fig. 7

Forest plot for screw placement accuracy in RA vs CF

Secondary outcomes for RA vs. CF

Operation duration (minutes)

Figure 8 reports operation duration in 6 studies enrolling a total of 514 patients. The RA group had significantly greater operation durations than the CF group (MD = 40.27, CI = 20.90, P < 0.0001). Heterogeneity was high (I2 = 90%, P < 0.00001).

Fig. 8

Forest plot of operation duration in RA vs CF

Radiation exposure (mSv)

Radiation exposure was reported in 4 studies enrolling 316 patients (Fig. 9). No statistically significant difference was noted between the groups (SMD = 2.06, CI = -1.7–5.83, P = 0.28). Heterogeneity was high (I2 = 98%, P < 0.00001).

Fig. 9

Forest plot of radiation exposure in RA vs CF

Estimated blood loss (milliliters)

EBL was reported in 7 studies with a total of 560 patients (Fig. 10). No statistically significant difference was noted between the two groups (MD = -0.44, CI = -71.93–71.04, P = 0.99). A high level of heterogeneity was found amongst the studies (I2 = 93%, P < 0.00001).

Fig. 10

Forest plot of estimated blood loss

LOS in days

Length of hospital stay was reported in 5 studies enrolling a total of 413 patients (Fig. 11). There was no statistically significant difference between the groups (MD = -0.18, CI = -0.49–0.14, P = 0.28). A moderate level of heterogeneity was found amongst the studies (I2 = 48%, P = 0.11).

Fig. 11

Forest plot of length of hospital stay

Cobb angle correction rate (%)

Cobb angle correction rate was reported in three studies enrolling a total of 203 patients (Fig. 12). Comparing the two groups, no statistically significant difference was seen (MD = 1.14, CI = -0.59–2.87, P = 0.20). A low level of heterogeneity was present (I2 = 0%, P = 0.67).

Fig. 12

Forest plot of Cobb angle correction rate

Total SRS-score

Total SRS score was reported in two studies enrolling a total of 198 patients (Fig. 13). Comparing the two groups, no statistically significant difference was seen (MD = 0.07, CI = -0.06–0.20, P = 0.26). A low level of heterogeneity was present (I2 = 0%, P = 0.65).

Fig. 13

Forest plot of Total SRS score in RA vs CF

Postoperative VAS pain score

Postoperative pain VAS score was reported in two studies enrolling a total of 86 patients (Fig. 14). Comparing the two groups, no statistically significant difference was seen (MD = -0.08, CI = -0.27–0.10, P = 0.39). A low level of heterogeneity was present (I2 = 0%, P = 0.80).

Fig. 14

Forest plot of postoperative VAS pain score in RA vs CF

Postoperative JOA score

Postoperative JOA score was reported in two studies enrolling a total of 86 patients (Fig. 15). Comparing the two groups, no statistically significant difference was seen (MD = -0.47, CI = -1.44–0.51, P = 0.35). A moderate level of heterogeneity was present (I2 = 54%, P = 0.14).

Fig. 15

Forest plot of postoperative JOA score in RA vs CF

Complications

Table 2 presents the complications reported across the included studies. RA shows superior outcomes in terms of screw placement accuracy compared to NS and CF. Additionally, NS shows superior screw placement accuracy relative to CF. No significant difference was seen in neurological complications or surgical revision rates. Revision surgery was mainly due to neurological or screw-related complications, including loosening or malposition; however, reporting of these data is limited.

Table 2 Surgical complications
Table 3 Quality assessment of nonrandomized studies using Newcastle–Ottawa classification

Quality assessment results

Overall, all studies were of high quality based on the Agency for Healthcare Research and Quality (AHRQ) standards. Quality of the non-randomized studies was assessed utilising the Newcastle Ottawa Scale, which uses a star system to analyse selection, comparability and outcome (Table 3). All 10 non-randomized studies demonstrated a high quality of patient selection with clear inclusion and exclusion criteria. Patients undergoing RA and CF surgery were drawn from the same database. Clear comparability was found in most studies, with similar preoperative patient characteristics including age, BMI, type of scoliosis and Cobb angles. Follow-up duration was adequate overall, but in some studies it was too short to assess postoperative outcome measures such as VAS and ODI.

Discussion

Relative to both NS and CF techniques, RA surgery was superior in terms of pedicle screw placement accuracy, with significantly greater odds of achieving clinically acceptable pedicle screw positioning. The downside of RA relative to both other groups was the significantly longer operation duration. Importantly, intraoperative and postoperative outcomes were comparable between the groups, including EBL, radiation exposure, LOS, Cobb angle correction rate, SRS score, postoperative VAS pain score and postoperative JOA score.

The role of RA surgery in orthopaedics and spine surgery is still evolving, but many studies have demonstrated associations with enhanced intraoperative and postoperative outcomes [24, 25]. A meta-analysis looking at RA knee arthroplasty showed better component positioning and alignment relative to conventional methods; however, similar to our study, operation durations were significantly prolonged in the RA group [26]. Another meta-analysis looking at RA hip arthroplasty demonstrated greater implant accuracy and reduced limb length discrepancies. Despite these advantages, operation durations were also extended with no significant differences in complications and implant positioning [27]. Sun et al. performed a meta-analysis of 20 RCTs comparing RA to CF in spine surgery. The cohort consisted mainly of patients with traumatic fractures and degenerative changes. Similar to our results, they showed increased screw placement accuracy with RA with minimal clinical benefits [28].

Owing to anatomical complexity, small pedicle sizes and challenging vertebral rotation in scoliosis, the risk of misplacement and clinical complications is higher. Within spine surgery, the role of RA is primarily to improve the accuracy of pedicle screw insertion [29]. Screw placement accuracy is vital to avoid neurologic, vascular or visceral harm as well as the need for revision surgery [30]. With CF surgery, screw misplacement rates can range from 2 to 31% and are significantly dependent on surgical expertise [31]. Most commonly, the Gertzbein-Robbins scale is used, which grades screw position from A to E, with screws being clinically acceptable if graded A or B [12]. A study by Zhang et al. showed significantly greater rates of clinically acceptable screws in RA (98.3%) relative to CF (93.6%) (p = 0.024). Compared to NS, RA also achieves higher screw placement accuracy, although the difference is less than that for CF [32].

It would be useful to understand the role of RA and NS in aiding complex pedicle screw insertion, particularly at the concavity of the curve apex where pedicles are dysplastic. This was only assessed by Chao Li et al. 2023, who showed that concave-sided pedicle screw misplacement was less frequent in RA relative to NS and CF. The reported rates of lateral-sided concave pedicle screw deviation were 1.4, 2.2 and 10.8% in RA, NS and CF respectively. On the medial concave side, RA and NS had no occurrence of pedicle screw misplacement, whereas CF had a reported rate of 3.9% [16].

Screw malposition may result in major complications such as CSF leak, nerve root irritation, vessel damage or even spinal cord injury [33]. However, despite the greater screw placement accuracy in RA relative to NS and CF, postoperative outcomes are comparable. This is widely supported across the literature, where postoperative Cobb angle correction and outcome measures such as VAS and ODI are similar between groups [34,35,36]. Additionally, rates of neurological injury and surgical revision are also similar between the groups [8, 15, 37]. This is mainly because neurological complications arise from deformity correction as opposed to pedicle screw placement [8].

Minimizing intraoperative radiation exposure is imperative, both for the surgical team and the patient. Although our meta-analysis did not show any difference in radiation exposure between the groups, a meta-analysis of studies in spine surgery in general showed reduced radiation exposure with RA relative to CF [34]. Khan et al., who compared radiation doses in RA to NS, showed no significant difference in radiation exposure between the two [35].

Theoretically, RA surgery should reduce cognitive and technical load and thus make surgery both faster and more accurate; however, real-world data remain controversial [33]. A shorter operation duration is important as it is associated with a reduced risk of surgical site infection and shorter postoperative hospital stay. Similar to our study, a meta-analysis of RCTs for spine surgery in general shows significantly longer operation durations in RA surgery; however, controversy still exists within the literature [34, 35]. It is important to consider the impact of the learning curve associated with RA, the expertise of the surgical team including radiographers, the time for registration, as well as the type of robot used [38]. A study comparing operation durations throughout the learning curve showed reduced operation durations for posterior spinal fusion after 17 to 18 cases [39]. The cost of RA systems as well as any associated training required must not be neglected when deciding between techniques, especially since evidence for improved clinical outcomes with RA is still limited [40].

RA surgery has become routine practice in many surgical specialties; with this come growing challenges and future considerations. Urologists have successfully adopted the da Vinci robot into routine care since FDA approval in 2001, and much can be learnt from this process within orthopaedics to improve implementation, tackle the learning curve and ensure patient safety [41]. It is important to acknowledge the imperfection of robotic systems and understand the technical difficulties surgeons may face, including equipment failure and other robot-related complications [42]. System failures should not compromise patient safety and overall care, and surgeons should be able to continue the procedure should the system fail [43].

The dilemma is that, if RA becomes routine practice, it is unclear how future generations will be trained on traditional techniques and whether robotic surgery should be part of the standardized curriculum [44]. Intraoperative neuromonitoring, which is used to assess neurological injury in spine surgery, is a valuable tool that has been implemented as routine practice in many places [45]. When neuromonitoring alerts occur, checklists are commonly used to assess the patient and ensure that the team takes systematic and standardized actions to maintain patient safety [46]. Similarly, to address robotic system malfunction, it would be useful to develop checklists and standardized work processes to reduce variability and improve teamwork and patient safety [43]. Detailed reporting of major robotic complications within the literature is necessary to allow us to tackle challenges, standardize workflow and improve care. Considering the limited evidence of clinical improvement with robotics in spine deformity, as well as the implementation challenges, it is important that the cost–benefit balance is carefully assessed.

To our knowledge, this study is the first systematic review to assess the role of robotics in scoliosis specifically. Moreover, it is the first systematic review comparing RA to both NS and CF. With a total of 14 forest plots, this review assesses a wide range of operative and postoperative outcomes. Despite these strengths, our study is not without limitations. It is based on only 10 studies, most of which are retrospective in nature. Study heterogeneity exists, particularly with regard to surgical technique and type of robot used. It would also be useful for future studies to assess the cause of revision surgery and neurological complications in RA and NS, as data from current studies are limited. This paper supports previous claims that RA and NS are superior to CF in accuracy but fails to show significant clinical benefits [33]. Moving forwards, it is important to consider performing larger prospective trials assessing the role of robotics in scoliosis correction, as well as cost and training repercussions [33].

Conclusion

In scoliosis, RA surgery offers greater pedicle screw insertion accuracy than NS and CF; however, RA operation durations are significantly longer. Intraoperative and postoperative outcomes are comparable between the groups. Larger trials looking at RA in scoliosis correction are needed to help clarify the relationship between these technologies. It is important to acknowledge the pros and cons of RA surgery and consider the role of this technology in future practice and surgical training.