Introduction

As the prevalence of adult spinal deformity (ASD) in the elderly population increases, the need for corrective surgery has also increased [1]. One major surgical concern is the minimization of mechanical complications (MC), which include proximal junctional kyphosis/failure (PJK/PJF), distal junctional kyphosis/failure (DJK/DJF), rod breakage, and other implant-related complications [2]. Given that MC occurs more often after ASD surgery than after other orthopedic surgeries and is associated with poorer clinical outcomes, many studies of its risk factors and preventive surgical techniques have been performed [3,4,5,6,7]. Yilgor et al. devised a scoring system to assess the risk of MC in patients after correction surgery for ASD: the global alignment and proportion (GAP) score [2]. This score analyzes sagittal plane alignment after ASD surgery in proportion to the individual pelvic incidence (PI). It consists of an age factor (younger or older than 65 years) and four parameters: relative pelvic version (RPV), relative lumbar lordosis (RLL), lordosis distribution index (LDI), and relative spinopelvic alignment (RSA) (Table 1). They proposed that the probability of MC after ASD correction surgery is closely related to the GAP score category (proportioned [GAP-P], moderately disproportioned [GAP-MD], and severely disproportioned [GAP-SD]) and that surgical plans that minimize the GAP score can effectively prevent MC.

Table 1 GAP score parameters

Subsequent studies aimed to clarify the validity of the GAP score; however, no consensus has been reached. Several cohort studies have suggested that the GAP score has adequate predictive power for MC [8, 9], whereas others reported little correlation between the two [10,11,12]. Some studies compared the GAP score with other evaluation systems, including the Roussouly classification and Schwab classification, and demonstrated its possible ability to predict MC [8, 13]. Finally, the latest systematic review (SR) attempted a comprehensive analysis, but the contradictory results of the preceding studies prevented a definitive conclusion [14]. Therefore, this study aimed to determine the predictive power of the GAP score for MC and revision surgery via a meta-analysis.

Materials and methods

Search strategy

Online searches were performed on the PubMed, EMBASE, and Cochrane Central Register of Controlled Trials (CENTRAL) databases using the search terms shown in Table 2, which focused on GAP score and mechanical complications (or failure). The search was conducted on November 15, 2022, in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [15].

Table 2 Database search terms

Inclusion/exclusion criteria and data collection

The inclusion criteria were as follows: (1) publication between January 1, 2016, and November 15, 2022; (2) MC defined as in the primary research by Yilgor et al. [2]; (3) ≥ 4 fused vertebrae; (4) minimum 2 years of follow-up; and (5) availability of quantitative data (GAP score category and MC status). The exclusion criteria were as follows: (1) duplicate article or salami publication; (2) commentary article; (3) case report/series including ≤ 5 patients; (4) SR or meta-analysis; and (5) full-text unavailability. Studies that did not meet the inclusion criteria were excluded during the title/abstract or full-text reviews.

Demographic data collected from each study included mean age, male/female ratio, mean follow-up period, and mean number of vertebrae fused. To obtain the number of patients by GAP score category and MC status, data reported as percentages were manually converted into whole numbers of patients. Odds ratios (OR) were calculated after combining the three GAP score categories into two. The meta-analysis of OR was therefore performed twice: once for GAP-P versus the higher score groups and once for GAP-SD versus the lower score groups. For each analysis, studies for which the OR could not be calculated were excluded.
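The data preparation described above can be illustrated with a minimal Python sketch. The study counts, the function names, the Wald confidence interval, and the direction of the comparison (higher-risk group in the numerator) are all assumptions made for illustration; none of them are taken from the included studies or from the authors' analysis scripts.

```python
import math

def percent_to_count(percent, group_size):
    """Convert a reported percentage back into a whole number of patients."""
    return round(percent / 100 * group_size)

def odds_ratio(events_a, total_a, events_b, total_b):
    """Odds of MC in group A relative to group B, with a 95% Wald interval.

    Returns None when any cell of the 2x2 table is zero, because the OR
    (or its standard error) is then undefined without a continuity correction.
    """
    a, b = events_a, total_a - events_a   # group A: MC / no MC
    c, d = events_b, total_b - events_b   # group B: MC / no MC
    if min(a, b, c, d) == 0:
        return None
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or),
            math.exp(log_or - 1.96 * se),
            math.exp(log_or + 1.96 * se))

# Hypothetical study: (patients, MC rate in %) for each GAP category.
reported = {"P": (40, 20.0), "MD": (50, 40.0), "SD": (60, 60.0)}
counts = {cat: (n, percent_to_count(pct, n)) for cat, (n, pct) in reported.items()}

def merge(categories):
    """Sum MC events and patients over the listed GAP categories."""
    total = sum(counts[c][0] for c in categories)
    events = sum(counts[c][1] for c in categories)
    return events, total

# Comparison 1: higher score groups (MD + SD) versus GAP-P.
or_p_vs_higher = odds_ratio(*merge(["MD", "SD"]), *merge(["P"]))
# Comparison 2: GAP-SD versus lower score groups (P + MD).
or_sd_vs_lower = odds_ratio(*merge(["SD"]), *merge(["P", "MD"]))
```

A study in which one merged group has no MC events, or no event-free patients, returns None here. That is one plausible reason an OR "could not be calculated" and the study was dropped from a given comparison.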

While collecting the data, the authors decided to perform an additional analysis of the OR of revision surgery for the GAP score categories because the rate of revision surgery can be an indicator of severe MC.

Risk of bias assessment

The level of evidence of all collected studies was assessed using the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) approach [16]. The risk of bias was also evaluated using the Risk of Bias Assessment tool for Non-randomized Study (RoBANS) [17].

Statistical analysis

All statistical analyses were performed using R version 4.4.2. The risk of MC was expressed as OR with 95% confidence intervals (CI), and values of P < 0.05 were considered statistically significant. When I² exceeded 50%, the random-effects model was used for the analysis; otherwise, the fixed-effect model was adopted.
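The model-selection rule stated above can be sketched in Python as follows. The source does not name the pooling estimators, so inverse-variance weighting for the fixed-effect model and the DerSimonian-Laird estimator for the random-effects model are assumptions, as are the function name and the toy inputs. The authors' actual analysis was run in R and may differ in detail.

```python
import math

def pool_log_odds_ratios(log_ors, variances, i2_threshold=0.5):
    """Pool study-level log ORs, choosing the model from I^2."""
    weights = [1 / v for v in variances]  # inverse-variance weights
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    # Cochran's Q measures the dispersion of studies around the fixed estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    if i2 > i2_threshold:
        # Assumed estimator: DerSimonian-Laird between-study variance tau^2.
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c)
        weights = [1 / (v + tau2) for v in variances]
        model = "random"
    else:
        model = "fixed"

    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return {
        "model": model,
        "i2": i2,
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
    }

# Toy inputs (hypothetical): two discordant studies, then three concordant ones.
heterogeneous = pool_log_odds_ratios([0.0, 2.0], [0.1, 0.1])
homogeneous = pool_log_odds_ratios([0.5, 0.5, 0.5], [0.1, 0.2, 0.3])
```

With the discordant inputs, I² is 0.95 and the random-effects branch is taken. With the concordant inputs, I² is 0 and the fixed-effect estimate is returned.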

Results

Study selection

From 184 articles initially collected from the three databases, 11 retrospective cohort studies were finally included in the analysis. The details of the inclusion/exclusion process are shown in the PRISMA flowchart (Fig. 1). The studies included a total of 1,617 patients who had undergone ASD correction surgery with a minimum of four levels of vertebral fusion; of them, 747 (46.2%) patients experienced MC (range, 27.7–60.6%). The rate of revision surgery was reported in nine studies (range, 11.7–34.9%). Among them, seven reported the number of patients by GAP score category and revision surgery status, and these were used for the subsequent meta-analysis. The included studies had an average sample size of 159.4 (range, 39–322), and their mean ages ranged from 50.7 to 76.5 years. The demographic and operative data for the studies are presented in Tables 3 and 4.

Fig. 1 PRISMA flowchart for study selection. GAP, Global Alignment and Proportion; MC, mechanical complication

Table 3 Demographic data of the included studies
Table 4 Operative data of the included studies

Validation of GAP score

In the comparison of the GAP-P group with the higher score groups, the meta-analysis of the OR of MC and revision surgery included nine and five studies, respectively. The difference in MC rate between the GAP-P group and the higher score groups was significant (OR = 2.83; 95% CI = 1.20–6.67; P = 0.02); however, there was no significant difference in revision surgery (OR = 1.76; 95% CI = 0.70–4.40; P = 0.23). Both were analyzed using the random-effects model, and no significant publication bias was found. Forest and funnel plots for the analyses are shown in Fig. 2.

Fig. 2 Forest plots and funnel plots of MC (a, b) and revision surgery (c, d) for the GAP-P and higher score groups. CI, confidence interval; MC, mechanical complications; MD, moderately disproportioned; OR, odds ratio; P, proportionate; SD, severely disproportioned

In the comparison of the GAP-SD group with the lower score groups, meta-analyses of the OR of MC and revision surgery included 11 and seven studies, respectively. Compared with the lower score groups, the GAP-SD group had a significantly higher rate of both MC (OR = 2.65; 95% CI = 1.57–4.45; P < 0.001) and revision surgery (OR = 2.27; 95% CI = 1.33–3.88; P = 0.003). Both were analyzed using the random-effects model, and significant publication bias was identified in the MC analysis. Forest and funnel plots are shown in Fig. 3.
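As a consistency check, the reported P values can be approximately recovered from the pooled OR and its 95% CI. The sketch below assumes a normal (Wald) approximation on the log-odds scale, with the standard error back-calculated from the interval width. The function and variable names are hypothetical, and the authors' software may have used a slightly different test statistic.

```python
import math
from statistics import NormalDist

def p_from_ci(odds_ratio, lower, upper):
    """Two-sided P value implied by an OR and its 95% CI (Wald approximation)."""
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    z = math.log(odds_ratio) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Pooled estimates as reported in the text: (OR, lower bound, upper bound).
p_mc_gap_p = p_from_ci(2.83, 1.20, 6.67)         # MC, GAP-P vs higher
p_revision_gap_p = p_from_ci(1.76, 0.70, 4.40)   # revision, GAP-P vs higher
p_mc_gap_sd = p_from_ci(2.65, 1.57, 4.45)        # MC, GAP-SD vs lower
p_revision_gap_sd = p_from_ci(2.27, 1.33, 3.88)  # revision, GAP-SD vs lower
```

Under this approximation, the four values round to the P values reported in the text (0.02, 0.23, < 0.001, and 0.003).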

Fig. 3 Forest plots and funnel plots of MC (a, b) and revision surgery (c, d) for the GAP-SD and lower score groups. CI, confidence interval; MC, mechanical complications; MD, moderately disproportioned; OR, odds ratio; P, proportionate; SD, severely disproportioned

Risk of bias assessment

The overall quality of evidence of the included studies was graded as low by GRADE. In the risk of bias assessment, nine of 11 studies had a high risk of selection bias caused by confounding variables. Details of the results are presented in Table 5 and Fig. 4.

Table 5 Level of evidence of included studies assessed by GRADE
Fig. 4 Risk of bias of included studies assessed by the Risk of Bias Assessment tool for Non-randomized Study

Discussion

These results suggest that the GAP score is valid for predicting MC to some extent. The GAP score comprises sagittal alignment indices, the ideal values of which are determined by the PI of each patient using a simple linear model [2]. This review found that the indices of the GAP score and their scoring thresholds might be precise enough to measure the risk of MC: the GAP-P group showed a significantly lower MC rate than the higher score groups, and the GAP-SD group showed a significantly higher MC rate than the lower score groups. Therefore, this result could justify practitioners planning ASD correction surgeries to minimize postoperative GAP scores and decrease the risk of MC.

In recent years, the precise assessment of sagittal alignment has become an essential tool in planning deformity correction surgeries. To determine the appropriate targets for the correction of ASD, the Scoliosis Research Society (SRS)-Schwab classification has been developed and widely adopted [18]. However, PI-LL mismatch, pelvic tilt, and sagittal vertical axis, the parameters of the SRS-Schwab classification, may occasionally be misleading because each is used independently as an absolute numerical value [19]. Even when these criteria are met after surgery, mechanical complications still occur with some frequency [20]. In contrast, all the parameters incorporated in the GAP score are evaluated in relation to the patient’s PI. Given the significant variability in PI across the general population, it has become necessary to establish sagittal parameter targets in proportion to each patient’s specific PI. Additionally, Roussouly et al. have highlighted that maldistribution of lordosis between the lower arc (L4–S1) and the upper arc (L1–L3) can alter the distribution of loads within the spinal column, potentially leading to mechanical failure [21]. Therefore, the proportional lumbar lordosis indices in the GAP score, specifically RLL and LDI, hold significant importance. An additional component introduced in the GAP score is the age factor. Multiple studies have shown that older age is a contributing risk factor for mechanical complications [21, 22].

Although the predictive power of the GAP score for revision surgery was not significant in the comparison of the GAP-P and higher score groups, this does not necessarily imply a defect in the GAP score, since other clinical or operative factors might have been involved in the decision to conduct revision surgery. The decision for revision surgery is made by considering patient pain, age, comorbidities, imbalance, and poor function; the GAP score, in contrast, is evaluated by postoperative sagittal imaging, focuses on the sagittal imbalance of ASD patients, and considers few other factors. Moreover, this result might have been affected by the small number of studies included in the analysis of revision surgery. Only five articles with 1,003 patients were selected for the revision surgery analysis of the GAP-P and higher score groups.

As mentioned in the introduction, an SR article on the same topic was published in September 2022 [14]. The article included 11 studies (1,517 patients) and suggested no significant difference in MC rate among the GAP score categories. The difference in conclusion between the SR article and this study may be due to differences in the analytic methods and in the included studies. For the statistical analysis, the SR article used the Kruskal–Wallis test for the three categories and Pearson’s chi-squared test for pairwise comparisons. This study combined the three groups into two and calculated OR to focus on the difference between the highest or lowest group and the others. These results are compatible, because the absence of a positive trend or of a distinct pairwise difference in the SR does not exclude the possibility of a correlation among the three categories. Moreover, the 11 included studies differed: two studies included in the SR were excluded from this study because of an insufficient minimum follow-up period (1 year in one study and 6 months in the other) [23, 24]. The minimum follow-up of 2 years in this study was taken from the primary research by Yilgor et al., which might be appropriate given that most cases of MC (especially PJK) occur within 2 years postoperatively [4]. The two articles newly included in this study met all of the inclusion criteria.

Nevertheless, this study has some limitations. First, the number of included studies and their level of evidence were insufficient. More studies and data are needed to obtain more convincing results regarding GAP score validity. Second, the analytical method of calculating OR does not fully reflect the stepwise structure of the three GAP score categories; the significance shown in this study cannot be interpreted as a positive trend. Third, this study did not consider other classification systems for ASD patients, such as the Schwab classification, the Roussouly classification, or the GAP score supplemented with body mass index and bone mineral density [8, 13, 25]. Further research should compare their performance in predicting MC with that of the GAP score. Fourth, this study could not measure health-related quality of life (HRQoL) or other clinical outcomes of the included patients. Because the GAP score was conceived around mechanical problems, the subsequent studies did not explore differences in HRQoL. Further studies are needed to identify clinical differences.

Conclusion

This meta-analysis indicates that the GAP score offers predictive value for the risk of mechanical complications in ASD correction surgery. For the prediction of revision surgery, only the GAP-SD category showed a significant association. Therefore, it is advisable to approach its application to surgical planning with caution.