Introduction

Adult spinal deformity (ASD) surgery is prone to postoperative complications, leading to high reoperation rates [1,2,3,4,5,6]. Indeed, almost 20% of patients sustain a mechanical complication (MC) related to implants or bony structures, typically implant breakage or junctional failure, resulting in reoperation [1, 3,4,5,6]. Of all the primary complications leading to reoperation, MCs make up more than 60% [3,4,5, 7]. Although reoperations due to MCs have been shown to be cost-effective and sometimes less expensive than the index surgery itself, they still cause a significant economic burden on healthcare systems [8,9,10].

The primary aim of ASD correction surgery is to achieve a spinal alignment that does not require significant compensation mechanisms postoperatively and is economical to maintain when the patient is ambulatory [7, 11]. To achieve this goal, various classifications and scores have been developed [12, 13].

The Global Alignment and Proportion (GAP) score is a novel proportional model to predict all MCs [7]. The GAP score is based on a patient’s individual pelvic incidence. The premise is that not everyone benefits from the same radiological targets. According to the original validation study by Yilgor et al., the cut-off points of the GAP score were determined as follows: 0–2 indicates proportioned, 3–6 moderately disproportioned, and 7–13 severely disproportioned spinopelvic alignment [7]. The best predictive value for any MC was found with the GAP score ≥ 2 [7].

The aim of this study was to determine the cut-off point and the predictive value of the GAP score for those MCs requiring reoperation. A second aim was to investigate the cumulative incidence of the reoperated MCs during a long follow-up period.

Methods

The hypothesis of this study was that the GAP score is associated with the risk for MCs that require reoperation. The study was an analysis of prospectively collected data (diary number: 17U/2012). For the ASD patients who underwent surgery at our institution (Central Finland Central Hospital, Finland) between 2008 and 2020, we used the following inclusion criteria: (1) patient age ≥ 18 years, (2) a minimum follow-up of two years, (3) a marked symptomatic sagittal spinal deformity (pelvic incidence minus lumbar lordosis [PI-LL] > 10°, sagittal vertical axis [SVA] > 5 cm, and pelvic tilt [PT] > 25°) and/or progressive symptomatic coronal thoracic or lumbar spinal deformity, and (4) restoration of sagittal and coronal balance as the main indication for surgery. The exclusion criterion was the lack of standing full spine posterior-anterior and sagittal radiographs, which prevented the calculation of the GAP score.

Disability was assessed using the Oswestry Disability Index (ODI) and a separate Visual Analogue Scale (VAS) for leg pain (VAS-Leg) and back pain (VAS-Back) [14]. The severity of the deformity was assessed with the SRS-Schwab deformity classification [13].

The GAP score was measured and calculated from full spine standing posterior-anterior and sagittal radiographs taken preoperatively and at 0–3 months postoperatively. The GAP score was calculated using the original formula RPV + RLL + LDI + RSA + AF, where RPV is the relative pelvic version, RLL is the relative lumbar lordosis, LDI is the lordosis distribution index, and RSA is the relative spinopelvic alignment (Fig. 1) [7]. In the formula, AF is an age factor that classifies the patient as an adult (< 60 years) or an older adult (≥ 60 years) [7].
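The summation and the categories of Yilgor et al. can be illustrated with a minimal Python sketch. It assumes that each component has already been converted into its integer points from the radiographic measurements; the point tables themselves are given in the original GAP publication [7] and are not reproduced here. The names `GapSubscores`, `gap_total`, and `gap_category` are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapSubscores:
    """Already-scored GAP components (integer points, not raw angles)."""
    rpv: int  # relative pelvic version
    rll: int  # relative lumbar lordosis
    ldi: int  # lordosis distribution index
    rsa: int  # relative spinopelvic alignment
    af: int   # age factor (adult vs. older adult)


def gap_total(s: GapSubscores) -> int:
    """GAP score = RPV + RLL + LDI + RSA + AF."""
    return s.rpv + s.rll + s.ldi + s.rsa + s.af


def gap_category(score: int) -> str:
    """Map a total GAP score (0-13) to the categories of Yilgor et al."""
    if not 0 <= score <= 13:
        raise ValueError("GAP score must lie between 0 and 13")
    if score <= 2:
        return "proportioned"
    if score <= 6:
        return "moderately disproportioned"
    return "severely disproportioned"


# Hypothetical patient, for illustration only (not study data).
example = GapSubscores(rpv=1, rll=0, ldi=2, rsa=1, af=1)
total = gap_total(example)
print(total, gap_category(total))
```

Keeping the point assignment separate from the summation means the sketch does not depend on any particular reading of the component thresholds.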

Fig. 1 Spinopelvic parameters in the formula of the GAP score. GT, global tilt; LL, lumbar lordosis; PI, pelvic incidence; SS, sacral slope. © S. Hiltunen

MCs were evaluated from postoperative radiographs and patient records. As in the original validation study, proximal junctional failure (PJF) was defined as a fracture of the upper instrumented vertebra or one vertebra above, pullout of instrumentation at the upper instrumented vertebra, and/or sagittal subluxation [7]. Distal junctional failure (DJF) was defined as a fracture, implant pullout, or symptomatic ≥ 10° postoperative increase in kyphosis between the lowest instrumented vertebra and one vertebra below it. Proximal junctional kyphosis (PJK) was defined as over 10° increase of kyphotic angle between the lower endplate of the upper instrumented vertebra and the upper endplate of the second vertebra above during the follow-up without the need for surgery. PJK was not reported separately when a patient sustained a PJF leading to reoperation.

Statistical methods

The descriptive statistics are presented as means with standard deviation (SD), as medians with interquartile range (IQR), or as counts with percentages. The relationship between the postoperative GAP score and the risk for MC requiring reoperation after ASD surgery was analyzed using generalized linear models. Receiver-operating characteristic (ROC) curves were used to determine thresholds for the MCs requiring reoperation. The diagnostic accuracy of the GAP score for MCs requiring reoperation was analyzed using the area under the curve (AUC), sensitivity, specificity, likelihood ratio, and positive predictive value. We defined the best cut-off point as the value that maximizes Youden's index (sensitivity + specificity − 1). Confidence intervals (95% CI) for the predictive values were obtained by bias-corrected and accelerated bootstrapping (10 000 replications). Cox proportional hazards regression was used to estimate the adjusted hazard ratios (HR) and their 95% confidence intervals. The relationship between the year of surgery and the operated MCs was analyzed using Spearman's rank correlation. Stata 17.0 (StataCorp LP, College Station, TX, USA) was used for the statistical analyses.
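How a cut-off is chosen with Youden's index, and how the AUC can be read as a rank statistic, can be sketched in a few lines of plain Python. This is an illustration only and not the Stata workflow used in the study; the function names and the toy data are invented for the example, and the bootstrap confidence intervals are not covered.

```python
def roc_points(scores, events):
    """Return (cut-off, sensitivity, specificity) for each observed score.

    A patient is predicted positive when score >= cut-off.
    """
    n_pos = sum(1 for e in events if e)
    n_neg = len(events) - n_pos
    points = []
    for cut in sorted(set(scores)):
        tp = sum(1 for s, e in zip(scores, events) if s >= cut and e)
        fp = sum(1 for s, e in zip(scores, events) if s >= cut and not e)
        points.append((cut, tp / n_pos, 1 - fp / n_neg))
    return points


def youden_cutoff(scores, events):
    """Cut-off that maximizes J = sensitivity + specificity - 1."""
    return max(roc_points(scores, events), key=lambda p: p[1] + p[2] - 1)


def auc(scores, events):
    """AUC as the probability that a randomly chosen patient with an event
    has a higher score than one without (ties count as one half)."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy data (not study data): GAP-like scores and reoperation flags.
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8]
toy_events = [0, 0, 0, 0, 1, 0, 1, 1]

cut, sens, spec = youden_cutoff(toy_scores, toy_events)
print(f"cut-off >= {cut}, sensitivity {sens:.2f}, specificity {spec:.2f}")
print(f"AUC {auc(toy_scores, toy_events):.2f}")
```

If several cut-offs share the same maximal index, this sketch returns the lowest one; a real analysis would need an explicit rule for such ties.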

Results

Between 2008 and 2020, 144 ASD patients underwent surgery to restore sagittal and coronal balance, performed by three experienced spinal orthopedic surgeons at our hospital (Central Finland Central Hospital, Finland). Two patients were excluded from the analysis: one was congenitally unable to stand and answer questionnaires, and one died before postoperative radiographs were acquired.

Preoperative patient characteristics are described in Table 1. Of the 142 included patients, 96 (68%) were female. The mean (SD; range) age of the included patients was 65 (± 9; 22–81) years. The mean (SD; range) follow-up time for MCs was 7 (± 3.3; 2–14) years. The mean (median; range) number of fused levels was 8 (9; 1–16). In total, 57 (40%) patients underwent three-column osteotomy (3CO), 62 (44%) had combined anterior interbody correction (ALIF) and posterior fusion, and 23 (16%) had posterior fusion and correction with multiple thoracolumbar Ponte osteotomies. Of the 142 patients, 139 had posterior rods. In all posterior instrumentations, titanium alloy pedicle screws were used. The posterior rod material was titanium alloy in 114 (82%) patients, cobalt-chrome in 24 (17%), and a titanium-cobalt-chrome hybrid in one (1%). A two-rod construct was used in 90 (65%) patients, a three-rod construct in 10 (7%), and a four-rod construct in 39 (28%). Three (2%) patients had anterior fixation or plating only, associated with high-angle (25–35°) anterior sagittal correction.

Table 1 Preoperative patient characteristics

The mean (SD) GAP score was preoperatively 7.7 (± 3.7) and postoperatively 4.4 (± 3.3) (Fig. 2).

Fig. 2 Postoperative GAP score (0–13) distribution of the 142 patients after ASD surgery. Box-and-whiskers plot shows median and IQR, and whiskers indicate 5th and 95th percentiles

The optimal cut-off point to predict the risk for MCs that require reoperation was a GAP score ≥ 5. The discriminative power of the GAP score to predict MCs that require reoperation was acceptable, with an AUC of 0.70 (95% CI: 0.58 to 0.81) (Fig. 3).

Fig. 3 ROC curve of the postoperative GAP score and risk for MCs that require reoperation after ASD surgery. Best predictive value for MCs that require reoperation was found with GAP score ≥ 5 (moderate disproportion, score 3–6)

The risk for MCs that require reoperation was significantly higher when the postoperative GAP score was ≥ 5 than when it was < 5 (HR 3.55, 95% CI: 1.40–9.02, p = 0.008), adjusted for age, sex, body mass index, number of fused levels, and diagnosis of osteoporosis or neuromuscular disease (Fig. 4).

Fig. 4 Cumulative failure in patients after ASD surgery and hazard ratio (HR) adjusted for age, sex, body mass index, number of fused levels, and diagnosis of osteoporosis or neuromuscular disease. Patients were divided into two risk groups using the GAP score cut-off value of 5

The total GAP score had a significant association with MCs that required reoperation (p = 0.003). Of the individual GAP score parameters, the LDI (p = 0.002) and the RSA (p = 0.008) were the best predictors of MCs that require reoperation, whereas the RLL (p = 0.067), the RPV (p = 0.35), and the AF (p = 0.57) were not statistically significant predictors (Fig. 5).

Fig. 5 Postoperative GAP score and scoring subgroups adjusted for age, sex, body mass index (BMI), number of fused levels, and diagnosis of osteoporosis or neuromuscular disease. Statistically significant predictive value for rod breakage or proximal junctional failure was found in the GAP score, lordosis distribution index (LDI), and relative spinopelvic alignment (RSA). RPV, relative pelvic version; RLL, relative lumbar lordosis; AF, age factor (< 60 or ≥ 60 years). Mechanical complications indicate those mechanical complications that required reoperation

Altogether, 23 (16%) patients sustained 26 MCs (cumulative incidence of 18%) that required reoperation because of a risk for instability or because the patient's symptoms were associated with an MC seen on the radiograph (Table 2). Three patients (2%) were operated separately for two different complications, PJF and rod breakage (RB). Reoperated MCs included 11 (42%) PJFs, one (4%) DJF, and 14 (54%) RBs. Mean (SD; range) time to reoperation was 14 (± 17; 0.5–48) months for junctional failure and 30 (± 30; 6–120) months for RB. Of the 26 reoperated MCs, 17 (65%) occurred more than six months after surgery. Of all patients, 24 (17%) had PJK ≥ 10° in their latest radiograph without major local symptoms, symptomatic loss of sagittal balance, or risk for instability and, therefore, were treated conservatively.
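The counts and percentages above can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers reported in this paragraph; the variable and function names are hypothetical, and percentages are rounded to whole numbers as in the text.

```python
n_patients = 142          # patients included in the analysis
patients_with_mc = 23     # patients with at least one reoperated MC
patients_with_two = 3     # patients reoperated for two different MCs

# Reoperated MCs by type: proximal junctional failure, distal junctional
# failure, and rod breakage.
mcs = {"PJF": 11, "DJF": 1, "RB": 14}
total_mcs = sum(mcs.values())


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Each of the 23 patients contributes one MC; three contribute a second one.
assert patients_with_mc + patients_with_two == total_mcs

print("patients with a reoperated MC:", pct(patients_with_mc, n_patients), "%")
print("reoperated MCs per included patient:", pct(total_mcs, n_patients), "%")
for kind, count in mcs.items():
    print(kind, pct(count, total_mcs), "% of reoperated MCs")
```

Note that 16% uses patients as the numerator, whereas 18% uses the number of reoperated MCs.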

Table 2 Incidence of mechanical complications that required reoperation

The number of the operated RBs (p < 0.001) and PJFs (p = 0.011) correlated with the year of surgery, with more incidents being in the earlier years of ASD surgery. The GAP score did not, however, correlate with the year of surgery (p = 0.239).

Discussion

The findings of this study confirm the association between the postoperative GAP score and MCs that require reoperation after ASD surgery. In the present patient cohort, the risk for sustaining an MC that requires reoperation was significantly higher when the postoperative GAP score was ≥ 5, indicating moderately disproportioned spinopelvic alignment. The cumulative incidence of MCs that required reoperation was 18%.

Several studies report conflicting results on the GAP score’s ability to predict MCs [15,16,17,18,19,20,21,22,23,24]. To the best of our knowledge, very few studies have specifically investigated the discriminatory power and threshold of the GAP score for those MCs that require reoperation. Yilgor et al. found that higher GAP scores were associated with higher rates of reoperated MCs [7]. Gupta et al. found that the discriminative power of the GAP score increased as the years of follow-up increased [17]. Jacobs et al. reported a good ability of the GAP score to predict MCs [20]. Ham et al. investigated the accuracy of the GAP score in predicting MCs specifically in patients with degenerative kyphoscoliosis [19]. In their study, the accuracy was moderate when considering all MCs but lower when considering only those MCs that required reoperation [19].

Kwan et al., Bari et al., and Baum et al. reported poor discriminatory power of the GAP score for MCs and/or reoperated MCs [15, 16, 24]. There were, however, differences between those studies and ours, such as the length of the fusions, the rate of 3CO, and the age of the patients, which may explain the differences in the results [15, 16, 24]. The studies also varied as to whether neuromuscular diseases were included [7, 15,16,17, 19, 20, 24]. In our study, patients were not excluded based on the etiology of ASD. Therefore, the patient population was heterogeneous and included, for example, neuromuscular diseases and degenerative and post-traumatic spinal deformities, which may also explain the differences in the results.

In the present patient cohort, the LDI and the RSA were the best individual parameters of the GAP score to predict MCs that required reoperation, whereas the RLL, the RPV, and the AF were poorer in predicting operated MCs. To our knowledge, very few studies have assessed the predictive value of the individual GAP score parameters for MCs, and the accuracy varied between studies [20, 21, 23, 25]. Moreover, the accuracy of the parameters for those MCs that require reoperation was not defined in the referenced studies.

Both Gupta et al. and Yilgor et al. defined the cut-off point of the GAP score for MCs to be ≥ 2 [7, 17]. The present study only included symptomatic and severe cases that required reoperation for MCs, whereas Yilgor et al. and Gupta et al. included all MCs [7, 17]. Therefore, the higher cut-off point in the present patient cohort is to be expected. Further, this suggests that patients in the present study tolerated greater disproportion before they sustained an MC and underwent reoperation. The patients in our study were also older compared to those in the studies of Yilgor et al. and Gupta et al. [7, 17]. Thus, it is possible that the patients in the present cohort had a lower level of physical demands and may, therefore, have had a lower risk for repetitive load leading to rod-related complications [7, 17].

Unfortunately, no surgical details were presented in the study by Yilgor et al., but the inclusion criterion was presented to be ≥ 4 levels of posterior instrumented fusion [7]. Therefore, it is not possible to compare whether the number of fused levels or other surgical methods could explain the differences in the cut-off levels. The study by Gupta et al. and the present study differed in, for example, the median length of fusion, the rate of 3COs, the rod constructs, and the rod materials used [17].

The other referenced studies focused mainly on assessing the accuracy of the GAP score for MCs and did not redefine the cut-off point [15, 16, 19, 20, 24]. However, as reoperations after ASD surgery cause significant financial costs, it is important to recognize the threshold for severe MCs that require reoperation [8, 10]. Therefore, when planning ASD surgery, it would be useful for the surgeon if the cut-off point of the GAP score for MCs had been evaluated in different patient groups, for example, according to the etiology of ASD or the surgical method used.

The cumulative incidence of the reoperated MCs in the present study is in line with previous studies [3,4,5, 7]. As the present study comprised only one DJF, a separate prediction analysis for DJF was not performed. Interestingly, the MCs correlated with the year of surgery, with surgery in the earlier years being more prone to MCs.

Our study confirms the validity of the GAP score in predicting MCs that require reoperation. The strength of this study is the relatively long follow-up time. To the best of our knowledge, there are very few long-term studies on the evaluation of the GAP score with more than five years of follow-up, as most studies report only a minimum follow-up period of two years [15, 16, 18,19,20,21,22,23]. In addition, we had an established clinical care team made up of highly experienced spine surgeons and nurses, and patient reachability for follow-up visits in the relatively stable catchment area was high. Furthermore, the large amount of prospectively collected data, which extensively covered patient demographics and, for example, the exact time points of MCs, is a further strength of this study.

The limitations of this study are the relatively small number of patients and the heterogeneous indications for ASD correction. In addition, the follow-up time of the patients who were operated on in the last few years was uneven. Further, the learning curve and the development of surgical techniques in ASD surgery may have biased the results, as those patients who underwent the most recent techniques tolerated a poor GAP score better. A further weakness of the study is that it does not take into account the effect of patient-reported outcome measures (PROMs) on reoperations, although the decision to reoperate was based on the patient's symptoms and radiographic MCs together. We are conducting another study in which we evaluate PROMs and reoperations due to MCs among ASD patients. The predictability of clinical outcomes for reoperations due to MCs would also be worth investigating.

Conclusion

This study confirms the validity of the GAP score to predict MCs requiring reoperation after ASD surgery. The best predictive value for surgically treated MC was found with the GAP score ≥ 5. The lordosis distribution index (LDI) and the relative spinopelvic alignment (RSA) were the best individual parameters of the GAP score to predict MCs. The cumulative percentage of surgically treated MCs was 18%. Bone-related MCs generally occurred earlier than implant-related MCs.