Introduction

Lumbar spondylosis can lead to back pain, radiculopathy, reduced quality of life, and work absenteeism. The rate of elective lumbar fusion for degenerative diagnoses has been increasing over the past two decades, with patients presenting with spondylolisthesis accounting for most of these operations in the USA [1]. Lumbar fusion is performed in adults with chronic and debilitating symptoms resulting from degenerative disc disease (DDD) and lumbar spondylolisthesis. Several fusion options exist, including anterior lumbar interbody fusion (ALIF), posterior lumbar interbody fusion (PLIF), transforaminal lumbar interbody fusion (TLIF), and posterolateral lumbar fusion (PLF). There is no consensus on the optimal technique for patients with DDD or spondylolisthesis. Clinical decision-making may be guided by patient history, physical examination, radiology, surgeon specialty, training, and preference [2].

ALIF provides some advantages over posterior fusion, including excision of the entire disc with restoration of disc height and lordosis, direct and indirect neural decompression, a larger cross-sectional area of graft material, insertion of a cage and graft under compression, and avoidance of damage to the erector spinae and the posterior ligamentous structures. The disadvantages include the risk of injury to vessels, viscera, parasympathetic and sympathetic nerves, and muscular atony of the abdominal wall [3]. ALIF procedures may be combined with posterior instrumentation (360-degree fusion). With the appropriate patient selection, stand-alone ALIF may be a more effective treatment method for symptomatic DDD and spondylolisthesis. It also avoids additional surgical and hospital recovery times, blood loss [4], and potential costs and morbidity associated with further posterior surgery. The advantages of stand-alone ALIF concerning patient-reported outcome measures (PROMs) still need to be determined.

Previous studies have included complex populations, including patients with spinal deformity, long constructs, prior spinal fusion, and advanced morbidities. Similarly, different anterior or posterior fusion approaches have been aggregated (e.g. stand-alone ALIF with 360-degree fusion). To improve our understanding of lumbar fusion procedures, we performed a systematic review and meta-analysis comparing stand-alone ALIF in patients with DDD and spondylolisthesis with posterior lumbar fusion techniques (PLIF/TLIF/PLF).

Methods

This systematic review compared stand-alone ALIF with posterior lumbar spinal fusion in adults with back or leg pain who had received a minimum of 6 weeks of nonoperative management. Randomised trials, cohort studies, and case series were included. Case reports and studies evaluating uninstrumented posterior fusion, long constructs, kyphosis/scoliosis, or circumferential fusion were excluded. The details are reported in the PROSPERO register [5].

Identification of relevant trials

Studies were sought by searching MEDLINE, EMBASE, and the Cochrane Register of Trials from inception to February 2022 without language restriction (see Additional File 1) according to the general principles of PRISMA-S [6]. The reference lists of the included studies and the literature reviews were searched for further studies. In addition, unpublished studies were sought by searching ‘grey’ literature using OpenGrey [7] and ClinicalTrials.gov [8] databases.

Study selection, data extraction, and analysis

Three reviewers (JR, MR, and DN) independently screened titles and abstracts. The full-text reports of potentially relevant studies were inspected. Where uncertainty arose, a resolution was achieved through discussion. Two reviewers (JR, SL) extracted study data, and quality was assessed using the Cochrane Back Review Group’s Risk of Bias criteria [9]. For studies that reported only the median, interquartile range, or minimum and maximum values instead of means and standard deviations (SD), the means and SDs were estimated using the methods of Hozo [10], Wan [11] and Walter [12].
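As an illustration of this conversion step, the sketch below implements two of the approximations attributed to Wan et al. as they are commonly stated: one for studies reporting median and range, one for studies reporting median and interquartile range. It is a minimal sketch, not the review's actual procedure; the function names and the example values are hypothetical, and the Hozo and Walter methods are not shown.

```python
from statistics import NormalDist

# Standard normal quantile function, used by both approximations.
_PHI_INV = NormalDist().inv_cdf


def mean_sd_from_range(minimum, median, maximum, n):
    """Approximate mean and SD from median, minimum, maximum and sample size."""
    if n < 2:
        raise ValueError("sample size must be at least 2")
    mean = (minimum + 2 * median + maximum) / 4
    sd = (maximum - minimum) / (2 * _PHI_INV((n - 0.375) / (n + 0.25)))
    return mean, sd


def mean_sd_from_iqr(q1, median, q3, n):
    """Approximate mean and SD from median, quartiles and sample size."""
    if n < 2:
        raise ValueError("sample size must be at least 2")
    mean = (q1 + median + q3) / 3
    sd = (q3 - q1) / (2 * _PHI_INV((0.75 * n - 0.125) / (n + 0.25)))
    return mean, sd


# Hypothetical example: length of stay reported as median 4 days (IQR 3 to 6), n = 40.
los_mean, los_sd = mean_sd_from_iqr(3, 4, 6, 40)
print(f"estimated mean {los_mean:.2f} days, SD {los_sd:.2f} days")
```

Both approximations assume roughly normally distributed data, so estimates for strongly skewed outcomes (blood loss, for example) should be read with caution.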

The studies were analysed using Review Manager (RevMan [Computer program]. Version 5.4.1, The Cochrane Collaboration, 2020). Relative risk (RR) and 95% confidence interval (CI) were calculated for dichotomous data. For continuous data, the mean difference (MD) and 95% CI were calculated. A random-effects model was chosen because the included studies were assumed to estimate different, yet related, intervention effects. Consequently, CIs for the average intervention effect are wider, and claims of statistical significance correspondingly more conservative, than under a fixed-effect model. Heterogeneity among the studies was tested using the chi-square test and the I² statistic.
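The pooling described above can be sketched in a few lines. The example below assumes inverse-variance weighting with the DerSimonian-Laird between-study variance estimator, which is the random-effects approach usually associated with RevMan 5; the review does not state the estimator explicitly, so this is an assumption. All function names and input numbers are hypothetical and are not data from the included studies.

```python
import math


def log_risk_ratio(events_a, total_a, events_b, total_b):
    """Log risk ratio and its standard error for one study (no zero cells)."""
    log_rr = math.log((events_a / total_a) / (events_b / total_b))
    se = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    return log_rr, se


def mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Mean difference (A minus B) and its standard error for one study."""
    return mean_a - mean_b, math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)


def pool_random_effects(effects, ses, z_crit=1.96):
    """DerSimonian-Laird random-effects pooling with Cochran's Q and I-squared."""
    if len(effects) < 2 or len(effects) != len(ses):
        raise ValueError("need at least two studies with matching standard errors")
    w = [1 / se ** 2 for se in ses]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0      # between-study variance
    w_star = [1 / (se ** 2 + tau2) for se in ses]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # heterogeneity, percent
    return {
        "pooled": pooled,
        "ci_low": pooled - z_crit * se_pooled,
        "ci_high": pooled + z_crit * se_pooled,
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical mean differences in surgical time (min) from three studies.
result = pool_random_effects([-60.0, -30.0, -50.0], [5.0, 5.0, 5.0])
print({key: round(value, 2) for key, value in result.items()})
```

For risk ratios, the pooling would be carried out on the log scale (using `log_risk_ratio`) and the pooled estimate and CI limits exponentiated afterwards.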

Results

A total of 16,435 records were identified from database searches, and the included studies' reference lists were inspected (Fig. 1). Following deduplication and exclusion of irrelevant studies, 94 full-text articles were examined for relevance. After further scrutiny, 21 studies were included. The search within OpenGrey and ClinicalTrials.gov databases did not identify additional studies. An updated search performed in February 2022 (n = 2147 records) did not identify any new studies.

Fig. 1

PRISMA flow diagram of study searches and study selection

Studies were conducted in eight countries: the USA (n = 10), Germany (n = 2), China (n = 2), Japan (n = 2), South Korea (n = 2), Denmark, Italy, and the UK. One study randomised patients to stand-alone ALIF or PLF, and the remaining 20 used non-randomised cohort designs. Spondylolisthesis and DDD were the primary diagnoses. The average age of the participants was 46 years, and the study size ranged from 21 to 969 (Table 1).

Table 1 Characteristics of studies

Four studies compared ALIF with PLIF [13,14,15,16]. Eight studies compared ALIF to TLIF [4, 17,18,19,20,21,22,23]. Six studies compared ALIF with PLF [24,25,26,27,28,29]. Three studies compared ALIF to multiple posterior fusion techniques [30,31,32].

Seventy-three full-text articles were excluded: seven because posterior fusion arms (e.g. PLIF and TLIF) were combined and reported as a single comparator; 13 because patients did not satisfy the inclusion criteria (e.g. spinal deformity, revision surgery); 19 because ALIF used 360-degree fusion rather than stand-alone fusion; nine because ALIF was compared with other surgical approaches (e.g. Extreme Lateral Lumbar Interbody Fusion) or non-contemporary techniques (uninstrumented PLF); and 25 because no relevant outcome data were available.
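The study-flow counts reported above can be cross-checked with simple arithmetic. The short sketch below uses only numbers stated in the text; the variable names and category labels are illustrative.

```python
# Full-text articles assessed and exclusion reasons, as reported in the text.
full_text_assessed = 94
excluded = {
    "posterior arms combined": 7,
    "inclusion criteria not met": 13,
    "ALIF with 360-degree fusion": 19,
    "other or non-contemporary comparator": 9,
    "no relevant outcome data": 25,
}
# Included studies by comparison, as reported in the text.
comparisons = {
    "ALIF vs PLIF": 4,
    "ALIF vs TLIF": 8,
    "ALIF vs PLF": 6,
    "ALIF vs multiple posterior techniques": 3,
}

total_excluded = sum(excluded.values())
included = full_text_assessed - total_excluded
print(f"excluded: {total_excluded}, included: {included}")
print(f"studies by comparison: {sum(comparisons.values())}")
```

The exclusion reasons sum to the 73 excluded articles, and both routes arrive at the 21 included studies.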

Risk of bias assessment

The methodological quality of the included studies is shown in Fig. 2.

Fig. 2

The methodological quality of studies summarising the risk of bias

The quality of the studies was often unclear because of inconsistent reporting. Most studies did not report attrition or whether assessors were blinded to the treatment groups. Baseline prognostic risk factors (PRF) were not reported in 11 studies; four studies stated that baseline PRF differed significantly owing to imbalances in age [14], PROMs [24], bone mineral density and spondylolisthesis [31], and smoking status [29]; and six studies reported no significant difference in PRF across surgical groups. One study used a randomised design with an overall low risk of bias [26]. The remaining 20 studies were prospective or retrospective observational studies in which the overall risk of bias was high or unclear. All non-randomised studies were at increased risk of allocation bias. Surgeons could not be blinded to the intervention, and most studies did not report how the fusion method was chosen. Four studies stated how the fusion approach was chosen. In one study [31], patients with primary lower back pain received ALIF, patients with single-leg pain received TLIF, and patients with neurogenic claudication received PLIF. Other single studies selected the fusion method for a specific pathology [29], were surgeon-dependent [33], or were influenced by patients' preference for one type of surgical procedure [16]. In another surgeon-dependent study, the same operative criteria were applied to all surgical interventions [25]. One study stated that the surgical procedure depended on the timing of surgery and that there was therefore no bias in the severity of symptoms between groups [27]. Evidence regarding selective outcome reporting was unclear. The risk of bias due to differences in the timing of outcome assessment was primarily low or unclear.

Main results

The results compared stand-alone ALIF with three posterior fusion approaches: ALIF vs. TLIF, ALIF vs. PLIF, and ALIF vs. PLF (Table 2).

Table 2 Outcome measures reported across the ALIF and posterior fusion comparators

Stand-alone ALIF versus TLIF

Surgical time

Seven studies reported surgical time, with high heterogeneity (I² = 93%). Surgical time (min) was significantly shorter for ALIF than for TLIF (n = 969, MD − 48.16, CI − 70.86, − 25.45) (Fig. 3).

Fig. 3

Mean surgical time (min) stand-alone ALIF vs. TLIF

Blood loss

Blood loss (ml) was significantly lower in patients undergoing ALIF than in those undergoing TLIF (n = 480, MD − 192.65, CI − 256.41, − 128.90), but with high heterogeneity (I² = 94%) (Fig. 4).

Fig. 4

Mean blood loss (ml) stand-alone ALIF vs. TLIF

Length of stay

Length of stay (LOS) (days) was significantly shorter in patients undergoing ALIF than in those undergoing TLIF (n = 883, MD − 0.71, CI − 1.42, − 0.00), but with high heterogeneity (I² = 76%) (Fig. 5).

Fig. 5

Length of stay (days) stand-alone ALIF vs. TLIF

Fusion rate

Fusion rates at one year (2 studies, n = 111, RR 1.00, CI 0.95, 1.05) and two years (n = 17, RR 0.95, CI 0.57, 1.59) indicated no significant difference between ALIF and TLIF.

Patient-reported outcome measures

Visual analogue scale

Visual analogue scale (VAS) scores for back pain were not significantly different between ALIF and TLIF at one year (4 studies, n = 1192, MD 0.20, CI − 0.30, 0.70) and two years (3 studies, n = 1507, MD − 0.18, CI − 0.50, 0.14). VAS leg pain scores at one year (3 studies, n = 1060, MD − 0.22, CI − 0.53, 0.09) and two years (3 studies, n = 1507, MD − 0.27, CI − 0.59, 0.06) were also not significantly different between ALIF and TLIF.

SF-36

The physical component scores on the SF-36 scale at one year (n = 906, MD 0.96, CI − 0.97, 2.89) and two years (2 studies, n = 1425, MD − 0.07, CI − 2.64, 2.51) follow-up were not significantly different between ALIF and TLIF.

The mental component scores on the SF-36 scale at one year were not significantly different (p = 0.07, n = 906, MD 2.15, CI − 0.18, 4.48). Two-year data also indicated no significant difference (2 studies, n = 1425, MD 0.02, CI − 1.96, 2.01) between ALIF and TLIF.

Oswestry disability index

Oswestry Disability Index (ODI) scores at one year (4 studies, n = 1217, MD − 2.63, CI − 6.76, 1.50) and two years (3 studies, n = 1507, MD − 0.02, CI − 1.90, 1.86) follow-up were not significantly different between ALIF and TLIF.

Adverse events

Adverse event rates were low and not significantly different between ALIF and TLIF. One study categorised adverse events as serious (n = 642, RR 1.62, CI 0.68, 3.84) and minor (n = 642, RR 1.11, CI 0.59, 2.10), with no significant differences between ALIF and TLIF. Readmission rates were not significantly different (n = 642, RR 2.06, CI 0.91, 4.67), although they were numerically lower with TLIF. Complications of surgery (2 studies, n = 154, RR 1.15, CI 0.27, 4.84), postoperative complications (2 studies, n = 566, RR 0.65, CI 0.40, 1.08), and adjacent segment disease (ASD) within two years (n = 47, RR 0.32, CI 0.07, 1.50) were not significantly different between ALIF and TLIF.

Stand-alone ALIF versus PLIF

Surgical time

Surgical time (min) was significantly shorter (n = 146, MD − 41.62, CI − 61.87, − 21.37) in patients who underwent ALIF than in those who underwent PLIF (Fig. 6).

Fig. 6

Surgical time (min) stand-alone ALIF vs. PLIF

Blood loss

The volume of blood loss (ml) was significantly lower (n = 146, MD − 186.61, CI − 355.18, − 18.04) in patients who underwent ALIF than in those who underwent PLIF (Fig. 7).

Fig. 7

Blood loss (ml) stand-alone ALIF vs. PLIF

Length of stay

LOS (days) was not significantly different (2 studies, n = 88, MD − 1.00, CI − 2.49, 0.48) between ALIF and PLIF.

Fusion rate

Fusion rates were not significantly different between ALIF and PLIF at one year (5 studies, n = 187, RR 0.92, CI 0.68, 1.24) or two years (2 studies, n = 109, RR 1.06, CI 0.97, 1.17).

Patient-reported outcome measures

Visual analogue scale

VAS back pain scores (n = 56, MD 0.76, CI − 0.04, 1.56) and leg pain scores (n = 74, MD 0.20, CI − 0.85, 1.25) were not significantly different.

Oswestry disability index

ODI scores up to 18 months (n = 58, MD 5.10, CI − 0.51, 10.71) and two years (n = 74, MD 2.40, CI − 6.40 to 11.20) were not significantly different between ALIF and PLIF. The ODI scores categorised as ‘not improved’ were also not significantly different (n = 74, RR 1.03, CI 0.41, 2.54) between ALIF and PLIF.

Prolo scale

One study (n = 28) using the modified Prolo scale found no significant difference (RR 2.08, CI 0.88, 4.91) between ALIF and PLIF.

Adverse events

Rates of postoperative deep vein thrombosis, hematoma, revision surgery, and ileus were low and not significantly different between the ALIF and PLIF groups. The rates of adjacent segment degeneration within two years, postoperative complications, blood transfusion, and dural tears also showed no significant difference between ALIF and PLIF.

Stand-alone ALIF versus PLF

Surgical time

Surgical time (min) was not significantly different between ALIF and PLF (p = 0.08), although the point estimate favoured ALIF (n = 536, MD − 40.98, CI − 86.47, 4.52), and heterogeneity was high (I² = 94%) (Fig. 8).

Fig. 8

Surgical time (min) stand-alone ALIF vs. PLF
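The relationship between a reported 95% CI and its p value can be illustrated with the surgical time result above. The sketch below assumes a normal approximation and a CI that is symmetric around the mean difference; the function name is hypothetical, and ratio measures such as RR would need the same calculation on the log scale.

```python
from statistics import NormalDist


def p_value_from_ci(estimate, lower, upper, level=0.95):
    """Two-sided p value implied by a symmetric CI around a mean difference."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(0.5 + level / 2)        # about 1.96 for a 95% CI
    se = (upper - lower) / (2 * z_crit)           # standard error from CI width
    z = estimate / se
    return 2 * (1 - norm.cdf(abs(z)))


# Surgical time, ALIF vs. PLF: MD -40.98, CI -86.47 to 4.52 (values from the text).
p_surgical_time = p_value_from_ci(-40.98, -86.47, 4.52)
print(f"p = {p_surgical_time:.2f}")
```

Under these assumptions the calculation reproduces the reported p = 0.08, and because the interval includes zero the difference does not reach significance at the 5% level.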

Blood loss

Mean blood loss (ml) was not significantly different between the ALIF and PLF groups (two studies, n = 77, MD − 261.44, CI − 566.56, 43.6).

Length of stay

LOS (days) was not significantly different between ALIF and PLF (4 studies, n = 592, MD 0.90, CI − 1.47, 3.27).

Fusion rate

In six studies, fusion rates were not significantly different between ALIF and PLF (n = 219, RR 1.02, CI 0.94, 1.10).

Patient-reported outcome measures

Visual analogue scale

VAS back pain scores at one year (n = 21, MD − 1.00, CI − 1.47, − 0.53) significantly favoured ALIF over PLF. VAS back pain scores at two years also favoured ALIF (2 studies, n = 67, MD − 1.39, CI − 1.67, − 1.11). However, the VAS leg pain scores (n = 46, MD 0.50, CI 0.12, 0.88) at two years significantly favoured PLF.

SF-36 Physical component score

ALIF patients showed significantly more improvement in the SF-36 physical component score (PCS) than PLF patients (n = 245, MD − 3.80, CI − 7.21, − 0.39).

Oswestry disability index

ODI scores at one year were not significantly different between ALIF and PLF (2 studies, n = 265, MD − 1.17, CI − 7.76, 5.42). However, two-year ODI scores significantly favoured ALIF over PLF (2 studies, n = 67, MD − 7.59, CI − 13.33, − 1.85), although heterogeneity was high (I² = 70%).

Japanese orthopaedic association score

Leg pain (n = 46, MD − 0.20, CI − 0.61, 0.21) assessed with the Japanese Orthopaedic Association Score (JOAS) was not significantly different at two years between ALIF and PLF. Low back pain at one year (n = 21, MD − 0.50, CI − 0.78, − 0.22) and two years (n = 67, MD − 0.36, CI − 0.65, − 0.07) significantly favoured ALIF (Fig. 9).

Fig. 9

Japanese orthopaedic association score (low back pain) stand-alone ALIF vs. PLF

Adverse events

Postoperative complications, the need for further surgery, severe and minor adverse events, and readmission rates were not significantly different between ALIF and PLF (Fig. 10).

Fig. 10

Adverse event rates stand-alone ALIF vs. PLF

Discussion

The results support the hypothesis that stand-alone ALIF offers advantages over posterior approaches by minimising surgical time and blood loss [4]. Any fusion surgery is associated with significant intraoperative blood loss. This can be associated with longer operative times and higher rates of complications, which may increase the LOS and reoperation rates. This could cause a further financial burden for hospitals and healthcare systems. Factors such as surgical approach, operative time, and surgical complexity are essential considerations in perioperative planning, and every effort should be made to minimise blood loss and avoid transfusion [33].

The LOS was shorter for ALIF than for TLIF but not for the other posterior approaches. Spinal fusion is the most expensive operating room procedure in the USA [34, 35]. Reducing LOS after lumbar fusion is a valuable step in curtailing healthcare costs.

The fusion rates were similar between the anterior and posterior approaches. The radiographic fusion rates reported in the literature for single-level surgeries vary between 70 and 96% [36]. There is not yet a consensus on the method and timing of radiographic fusion assessment, which impedes the comparison of treatment outcomes. Additionally, the progression of bone formation over time, and how this relates to bone grafting options, necessitates longer follow-up periods. Pseudoarthrosis may have deleterious effects on postoperative outcomes, including segmental instability, persistent or recurrent pain, and hardware failure.

The PROMs for disability and pain scores were equivocal when comparing ALIF with PLIF/TLIF. Back pain PROMs (VAS, ODI and SF-36 PCS) favoured ALIF over PLF. PROMs are affected by surgical and patient factors and fusion rates. Importantly, validated measures in spinal care are necessary for surgeons, patients, and policymakers to compare the effectiveness of different treatments in the same condition [37].

Adverse events did not differ significantly between the ALIF and posterior approaches, and the rates were generally low. A clinical evaluation by Chen et al. (2017) indicated that surgeons adequately recorded major adverse events but often omitted minor events [38]; it is therefore possible that the results of this study underestimated minor adverse events. It is imperative to record and track adverse events and complications to implement programs to reduce the morbidity and mortality associated with spinal fusion. Adverse events may have a negative effect on PROMs and impose a substantial economic burden on health care.

This study had several limitations. Twenty studies used a non-randomised design, which increased the risk of introducing confounding variables. For example, imbalances in the severity of pre-existing illnesses or comorbidities associated with poorer outcomes in spine surgery, such as smoking status, obesity, age, and socioeconomic status, could have affected surgical outcomes and PROMs. Such confounders could have been minimised by restricting baseline variables. Selection bias may have occurred in the choice of surgical approach, albeit because surgeons made pragmatic decisions based on the diagnosis, clinical and radiological features, or other considerations to maximise patient well-being.

This study could not analyse several potentially confounding variables because of heterogeneity in the techniques used and the varied and limited detail in the technique descriptions within the articles assessed. Such variables include diagnostic subtypes (e.g. degenerative versus isthmic spondylolisthesis), graft material (e.g. autograft, allograft, bone substitutes), interbody device type (e.g. PEEK versus titanium alloy), fixation type (e.g. integrated screws versus plate-screw constructs), and surgical approach (e.g. open versus minimally invasive).

Limited data are available for PROMs, and most findings were equivocal. This is partly due to the heterogeneity of the published data available for comparison. Available PROMs only permit analysis of short-term outcomes (1 to 2 years). More mid- and long-term data are needed to improve our understanding of the effects of the fusion approach on PROMs and ASD rates. The best clinical evidence should form the basis for guidelines on lumbar fusion. Variability remains in the reporting of PROMs, adverse events, and specific diagnoses in the literature. In addition, there is a lack of uniformity in the surgical approaches used. Finally, continued efforts to develop higher-quality data from well-designed trials about surgical indications and techniques for lumbar fusion will provide valuable information for healthcare decision-making.

Conclusions

Overall, stand-alone ALIF was associated with a shorter operative time and less blood loss than the PLIF and TLIF approaches. PROMs were equivocal when compared to those of PLIF or TLIF. Back pain and disability scores favoured ALIF over PLF. Adverse events were also equivocal between the ALIF and posterior fusion approaches. Future studies must follow the CONSORT and STROBE guidelines [39] to help elucidate the effects of anterior and posterior fusion, particularly for PROMs.