Introduction

Depression, anxiety, and low self-esteem can be significant psychological consequences of obesity [1]. Both physical and psychological consequences can be debilitating and decrease overall Health-Related Quality of Life (HRQoL). Bariatric surgery causes significant changes in patients’ lives, including HRQoL. There is a growing body of literature demonstrating an improvement in both the physical and psychological status of patients following bariatric procedures [2].

Objective assessment of QoL can be made using validated and standardized tools. A variety of those tools have gained popularity among researchers, but the HRQoL assessment tools most commonly used in the surgical literature are the Short Form-36 (SF-36) [3], Moorehead-Ardelt Quality of Life [4], Gastrointestinal Quality of Life (GIQLI) [5], and Impact of Weight on Quality of Life (IWQOL) [6]. Although there are numerous studies on HRQoL after different bariatric procedures, the heterogeneity of the tools used and of the types of bariatric procedures makes the interpretation of these data complex. Considering the paucity of literature on this topic, we aim to conduct a systematic review and a network meta-analysis to produce a comprehensive comparison of HRQoL after different bariatric procedures.

Methods

Search Strategy

The search was conducted in April 2020 by five teams of two researchers each and covered the Medical Literature Analysis and Retrieval System Online (MEDLINE), Excerpta Medica dataBASE (EMBASE), and Scopus databases. No language restrictions were applied. The full search strategy for the OVID platform is available in the supplement files. This study was reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, network meta-analysis extension [7]. The protocol of this study was registered before commencement in the Prospective Register of Systematic Reviews (PROSPERO, CRD42019132975).

Eligibility Criteria

The analyzed population comprised patients with severe obesity who underwent bariatric surgery. Studies were eligible for inclusion if they were randomized controlled trials (RCTs) or non-randomized studies with a control group, such as cohort studies (prospective or retrospective). We decided to include non-randomized studies to increase the number of interventions that could be compared. Letters, editorials, case reports, case series, and review papers were excluded. Included studies had to comprise at least two arms (at least one of which was bariatric surgery) and report a follow-up period of 1, 2, 3, or 5 years. Published abstracts were not included because of the limited information available for analysis and for the risk of bias assessment. Studies had to report HRQoL using any validated tool. The authors of primary studies were contacted in case of missing data.

Outcome Measures

The primary outcome of this systematic review was HRQoL at 1 year, 2 years, 3 years, and 5 years after bariatric surgery. Secondary outcomes involved specific domains of HRQoL (vitality, physical functioning, bodily pain, general health perceptions, physical role functioning, emotional role functioning, social role functioning, and mental health). The tools used for the assessment included The Bariatric Quality of Life (BQL) [8], The Laval Questionnaire [9], GERD-Health-Related Quality of Life Questionnaire (GERD-HRQL) [10], World Health Organization Quality of Life Instruments (WHOQOL-BREF) [11], Short Form 8 Health Survey (SF-8) [12], Bariatric Analysis and Reporting Outcome System (BAROS) [13], The Moorehead-Ardelt Quality of Life Questionnaire (MA) [4], Gastrointestinal Quality of Life Index (GIQLI) [5], The Short Form 36 Health Survey (SF-36) [3], Obesity and Weight-Loss Quality of Life (OWLQOL) [14], Rand 36-Item Health Survey (RAND-36) [3], and The Impact of Weight on Quality of Life-Lite (IWQOL-Lite) [6].

Study Selection and Data Extraction

Each of the records downloaded from the searches was screened by at least two researchers independently. All teams identified and selected citations first on the basis of titles and abstracts and then on the basis of full texts. In case of disagreement, an attempt was made to reach a consensus within the group. If no resolution was possible, the final decision was made by a third reviewer. Data from included studies were extracted independently by two researchers into a prepared Excel sheet. When available, the following data were extracted: first author, year of publication, country, number of operated patients, type of intervention, type of study, HRQoL form, and outcomes of interest (endpoint data). Whenever the standard deviation was missing, it was derived as shown by Fu et al. using an average coefficient of variation [15].
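The imputation of a missing standard deviation can be illustrated with a minimal Python sketch. It assumes that the average coefficient of variation (CV = SD / mean) is computed across arms that reported both values for the same instrument and is then multiplied by the mean of the arm with the missing SD. The function name `impute_sd` and the example numbers are hypothetical, and the exact procedure described by Fu et al. [15] may differ in detail.

```python
from statistics import mean


def impute_sd(target_mean, reference_arms):
    """Impute a missing SD from an average coefficient of variation.

    reference_arms: list of (mean, sd) pairs from arms that reported both
    values on the same instrument (an assumption of this sketch).
    """
    # Coefficient of variation for every arm with a usable (non-zero) mean.
    cvs = [sd / m for m, sd in reference_arms if m != 0]
    if not cvs:
        raise ValueError("at least one reference arm with a non-zero mean is required")
    # Imputed SD = average CV multiplied by the mean of the arm lacking an SD.
    return abs(target_mean) * mean(cvs)


# Hypothetical example: two reference arms with CVs of 0.20 and 0.10.
imputed = impute_sd(100.0, [(100.0, 20.0), (50.0, 5.0)])
print(round(imputed, 2))
```

Averaging the CV rather than the SD itself keeps the imputed value proportional to the scale of the arm being filled in, which matters when arms have very different mean scores.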

Study Quality

Study quality was assessed by two researchers independently. Observational studies were evaluated using the Newcastle–Ottawa Scale (NOS), which consists of three domains: patient selection, comparability of the study groups, and the assessment of outcomes [16]. Randomized controlled trials were assessed using The Cochrane Collaboration’s Risk of bias tool [17].

Statistical Analysis

Statistical analysis was performed using WinBUGS 1.4 (BUGS project, MRC Biostatistics Unit, University of Cambridge). Network meta-analysis was conducted using Bayesian statistics according to Markov chain Monte Carlo methods. The model used for calculation was derived from the generalized linear models for random effects presented by Dias et al. in Network Meta-Analysis for Decision Making and is shown in Supplement File 2 [18]. To pool data from different HRQoL forms, standardized mean differences (SMD) were used; results in graphs are presented as SMD with 95% credible interval (CrI), while in tables the SMDs are converted to the GIQLI scale [19]. The minimal clinically important difference (MCID) for GIQLI was set at 5 points [20]. The model was run using three chains with an initial burn-in sampling of 10,000 per chain. Initial values for each chain were generated randomly. Statistical heterogeneity between the studies was assessed through the residual deviance of each model. Publication bias was assessed by visually inspecting the asymmetry of the funnel plot for analyses which included at least 10 studies.
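To make the scale handling concrete, the following Python sketch computes an SMD for a single two-arm comparison, re-expresses it in points of a target instrument, and compares it with the MCID of 5 points. It is only an illustration under stated assumptions: the pooled-SD standardizer, the function names, the reference SD of 16 points, and all input values are hypothetical and are not taken from the included studies or from the Bayesian model in Supplement File 2.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent arms."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def smd(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference (arm 1 minus arm 2) in pooled-SD units."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


def smd_to_points(smd_value, reference_sd):
    """Re-express an SMD on an instrument scale by multiplying by a reference SD."""
    return smd_value * reference_sd


def exceeds_mcid(difference_points, mcid=5.0):
    """True when the absolute difference reaches the MCID."""
    return abs(difference_points) >= mcid


# Hypothetical arms: mean score, SD, and sample size.
effect = smd(110.0, 10.0, 50, 105.0, 10.0, 50)
points = smd_to_points(effect, reference_sd=16.0)  # 16.0 is a placeholder SD
print(effect, points, exceeds_mcid(points))
```

Because the SMD is unitless, the choice of reference SD fully determines the converted value, so any conversion of this kind is only as reliable as the reference SD chosen for the target instrument.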

Results

The initial reference search yielded 8892 records. After removing duplicates, 6346 titles and abstracts were reviewed and 484 papers were selected for full-text screening. Finally, 47 studies (17 RCTs and 30 non-RCTs) conducted in 17 countries were included in the network meta-analysis (Fig. 1). The studies included a total of 26,629 patients. A total of 11 surgical procedures were evaluated in the primary studies, which included laparoscopic sleeve gastrectomy (LSG, 25 studies), laparoscopic Roux-en-Y gastric bypass (LRYGB, 37 studies), laparoscopic biliopancreatic diversion with duodenal switch (BPD-DS, 6 studies), laparoscopic vertical banded gastroplasty (VBG, 1 study), laparoscopic adjustable gastric banding (LAGB, 8 studies), laparoscopic banded Roux-en-Y gastric bypass (banded-GB, 2 studies), laparoscopic greater curvature plication (LGCP, 1 study), laparoscopic distal Roux-en-Y gastric bypass (distal-GB, 2 studies), laparoscopic one anastomosis gastric bypass (OAGB, 5 studies), prolonged biliopancreatic limb gastric bypass (LB-GB, 1 study), and distal one anastomosis gastric bypass (distal-OAGB, 1 study). General characteristics of the included studies are presented in Table 1 with NOS quality scores or Cochrane risk of bias assessments. The analyses are presented separately for each of the pre-specified periods of follow-up: 1, 2, 3, and 5 years. The number of studies included in the analysis decreased with the length of follow-up, as did the number of procedures studied. Results for secondary outcomes are available in the supplement files. The networks of studies for each follow-up period are presented in Fig. 2.
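The selection flow reported above implies the number of records removed at each stage. The short Python sketch below derives these numbers from the reported counts only; the variable names are illustrative, and the derived exclusions are simple subtractions rather than figures stated in the flowchart.

```python
# Counts reported in the text.
records_identified = 8892
after_deduplication = 6346
full_text_screened = 484
included = 47
rcts, non_rcts = 17, 30

# Derived counts for each stage of the selection process.
duplicates_removed = records_identified - after_deduplication
excluded_on_title_abstract = after_deduplication - full_text_screened
excluded_on_full_text = full_text_screened - included

# Consistency check: the study designs should add up to the included total.
designs_match_total = (rcts + non_rcts == included)

print(duplicates_removed, excluded_on_title_abstract, excluded_on_full_text)
print(designs_match_total)
```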

Fig. 1 PRISMA flowchart

Table 1 Basic characteristics of the included studies

Fig. 2 Study network in the meta-analysis: A at 1 year, B at 2 years, C at 3 years, D at 5 years of follow-up

Results at 1-Year Follow-up

Network meta-analysis was based on 22 studies (15 cohort studies and 7 RCTs) comparing 8 different surgical techniques (LSG, LRYGB, BPD-DS, VBG, LAGB, LGCP, OAGB, and distal-OAGB) (Fig. 3, Table 2). The analysis showed a significant difference in HRQoL in favor of LSG, LRYGB, and OAGB compared with lifestyle intervention (SMD: 0.44; 95% CrI 0.2 to 0.68 for LSG, SMD: 0.56; 95% CrI 0.31 to 0.8 for LRYGB, and SMD: 0.43; 95% CrI 0.06 to 0.8 for OAGB) and no significant effect for the remaining procedures. Pairwise comparisons showed a significant difference in HRQoL in favor of LRYGB vs. LSG (SMD: 0.11; 95% CrI 0.07 to 0.16), while VBG, LAGB, and distal-OAGB had significantly lower HRQoL than LSG and LRYGB; however, the difference between LRYGB and LSG was not clinically significant (below the MCID of 5 points) (Table 3). In a detailed analysis of the physical aspect, all surgical interventions apart from LAGB and LGCP led to better HRQoL than lifestyle intervention (supplementary file 4). With regard to specific HRQoL domains, pairwise comparisons showed that LAGB was inferior to lifestyle intervention in the physical domain and the general health perceptions domain of HRQoL after 1 year, while LSG, LRYGB, BPD-DS, and OAGB were associated with better HRQoL in the general health perceptions domain than control (supplementary file 4). Detailed information on the remaining aspects of QoL is shown in the supplementary files. Visual assessment of the funnel plot for LSG vs. LRYGB showed limited publication bias.
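The way a credible interval can be read as evidence of a difference is sketched below in Python. The sketch assumes the common convention that an effect is called significant when the equal-tailed 95% CrI excludes zero; this convention, the function names, and the evenly spaced stand-in "posterior draws" are assumptions made for illustration and do not reproduce the WinBUGS output of this study.

```python
from statistics import quantiles


def credible_interval_95(draws):
    """Equal-tailed 95% interval: the 2.5% and 97.5% quantiles of the draws."""
    cuts = quantiles(draws, n=40)  # 39 cut points at 2.5% steps
    return cuts[0], cuts[-1]


def excludes_zero(lower, upper):
    """True when the whole interval lies on one side of zero."""
    return lower > 0 or upper < 0


# Stand-in draws for an effect that is clearly positive.
positive_draws = [0.2 + 0.001 * i for i in range(481)]
low, high = credible_interval_95(positive_draws)
print(excludes_zero(low, high))

# Stand-in draws for an effect centered on zero.
null_draws = [-0.3 + 0.001 * i for i in range(601)]
low0, high0 = credible_interval_95(null_draws)
print(excludes_zero(low0, high0))
```

In practice, draws from all chains after burn-in would be pooled before the quantiles are taken, and an interval that excludes zero would still need to be compared with the MCID before being called clinically relevant.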

Fig. 3 Pooled results of total HRQoL presented as SMD after 1 year: a in comparison to lifestyle intervention; b pairwise comparisons between surgeries

Table 2 HRQoL after 1 year presented on the GIQLI scale (0–144)
Table 3 HRQoL after 2 years presented on the GIQLI scale (0–144)

Results at 2-Year Follow-up

Network meta-analysis was based on 15 studies (7 cohort studies and 8 RCTs), involving 8 bariatric procedures (LSG, LRYGB, BPD-DS, LAGB, banded-GB, distal-GB, OAGB, and LB-GB) (Fig. 4, Table 3). When compared with lifestyle intervention, only banded-GB and LB-GB had significantly better HRQoL at 2 years (SMD: 0.92; 95% CrI 0.3 to 1.52 for banded-GB and SMD: 0.89; 95% CrI 0.26 to 1.51 for LB-GB). In pairwise comparisons, LRYGB was associated with better HRQoL than LSG; however, the difference was not clinically relevant (below the MCID of 5 points). Distal-GB was associated with worse HRQoL compared to standard LRYGB, whereas the LB-GB and banded-GB modifications were associated with better HRQoL. LAGB was associated with worse HRQoL than LSG, LRYGB, BPD-DS, banded-GB, and LB-GB. Detailed information on specific aspects of HRQoL is available in the supplementary files.

Fig. 4 Pooled results of total HRQoL presented as SMD at 2 years: a in comparison to lifestyle intervention; b pairwise comparisons between surgeries

Results at 3-Year Follow-up

Network meta-analysis was based on 9 studies (6 cohort studies, 4 RCTs) involving 5 different surgical procedures (LSG, LRYGB, BPD-DS, LAGB, banded-GB). LSG, LRYGB, BPD-DS, and LAGB showed better HRQoL than lifestyle intervention (SMD: 0.9; 95% CrI 0.58 to 1.23, SMD: 0.96; 95% CrI 0.65 to 1.29, SMD: 1.16; 95% CrI 0.45 to 1.87, SMD: 0.78; 95% CrI 0.4 to 1.17, respectively, all exceeding the MCID of 5 points), while no significant difference was found for the remaining procedure in comparison with control (supplementary file 3).

Results at 5-Year Follow-up

Network meta-analysis was based on 7 studies (3 cohort studies, 4 RCTs) involving 4 different surgical procedures (LSG, LRYGB, BPD-DS, and OAGB). All interventions showed better HRQoL in comparison to control (SMD: 0.92; 95% CrI 0.58 to 1.26, SMD: 1.27; 95% CrI 0.94 to 1.61, SMD: 1.43; 95% CrI 1 to 1.87, and SMD: 1.01; 95% CrI 0.63 to 1.4, respectively). Pairwise comparisons showed that both LRYGB and BPD-DS had better HRQoL than LSG and OAGB, with no difference between LRYGB and BPD-DS (supplementary file 3).

Discussion

To our knowledge, this network meta-analysis is the first to attempt to summarize and compare HRQoL after different bariatric procedures in patients with severe obesity. In total, we included 47 studies with 26,629 patients and 11 different surgical techniques covering the follow-up period from 1 to 5 years. Our analysis included both RCTs and observational studies to assess as wide a range of interventions as possible. Short-term results (1 year) showed that only LSG, LRYGB, and OAGB offer better QoL in comparison to non-surgical interventions, with LRYGB showing better results than LSG. Pairwise comparisons showed that LAGB and VBG result in worse HRQoL in comparison to LSG or LRYGB. Medium-term results (2 years) showed that patients who received banded-GB reported better HRQoL improvement than non-surgical patients, while LAGB resulted in worse results than other techniques, excluding OAGB and distal-GB. Long-term results (3 and 5 years) showed that LRYGB and LSG maintain HRQoL after surgery. BPD-DS at 5 years showed significant improvement over control, which is in contrast to previous years.

A previous network meta-analysis by Park et al. compared weight loss and remission of comorbidities following various bariatric procedures; however, that study did not explore the effects of those interventions on HRQoL [65]. The only meta-analysis which focused on HRQoL after bariatric surgery contained only pairwise comparisons and compared bariatric with non-bariatric patients, showing an improvement in HRQoL, mainly in the domain of physical functioning and activity [66]. A previous pairwise comparison by Hu et al. showed no statistically significant differences in HRQoL between LSG and LRYGB [67]. In general, LSG and LRYGB are the most commonly performed types of surgery worldwide, which is also reflected in the number of studies in our review comparing these two techniques. Even though both techniques are well established and have been performed for several years, the debate on which method is better is ongoing, with pros and cons for each one. In recent years, this has resulted in several RCTs, including SLEEVEPASS (5-year results) and SM-BOSS (5-year results), showing no differences in HRQoL between those two procedures [54, 56, 68]. Our analysis found that both LSG and LRYGB in the long term (3–5 years) are associated with better HRQoL than no surgical intervention. Pairwise comparisons show that LAGB and VBG will likely result in worse HRQoL than other techniques, which is consistent with previous literature [69]. The present study suggests that LAGB may worsen HRQoL in comparison to LSG, LRYGB, or BPD-DS. Finally, with the current evidence, it is unclear whether BPD-DS improves or worsens HRQoL in the short term after surgery. This may be associated with malnutrition and a more demanding diet than with other techniques [70]. Results from the 5-year follow-up demonstrate that surgical interventions such as LSG, LRYGB, BPD-DS, and OAGB provide better HRQoL than non-surgical methods.

One of the advantages of this network meta-analysis is the comparison of different variations of gastric bypass. In general, the most commonly performed bypasses are LRYGB and OAGB. In our review, we compared other less common versions such as banded gastric bypass (elastic band placed on the pouch), distal gastric bypass (long alimentary limb), long biliopancreatic limb gastric bypass, and distal-OAGB (long biliopancreatic limb). Our results show that banded-GB and LB-GB are sometimes associated with better HRQoL than standard LRYGB, whereas distal-GB did not improve HRQoL. A systematic review by Shoar et al. showed that although banded-GB is associated with higher weight loss, this came at the expense of a higher incidence of food intolerance and postoperative vomiting, which can impact HRQoL [71]. The weight loss outcomes are similar for LRYGB and LB-GB. However, at 2 years, LB-GB was associated with better HRQoL than LRYGB [55, 72,73,74]. OAGB is still a controversial method with limited literature on long-term effectiveness [75,76,77,78]. The main modifications of this technique involve different lengths of the biliopancreatic limb [60]. The analysis of pooled data in this review showed that the variation with the longer biliopancreatic limb in OAGB is associated with worse HRQoL in comparison to standard OAGB.

BPD-DS, as the technique that alters the gastrointestinal tract to the greatest extent, requires more time for patients to adjust to new dietary patterns and to the need for proper vitamin supplementation. The study by Strain et al. demonstrated that patients’ HRQoL improves after BPD-DS in the long run (9-year follow-up) [79]. BPD-DS in our analysis showed better HRQoL improvement than LSG.

This systematic review is the first comprehensive analysis of the impact of different bariatric procedures on HRQoL. Although a multi-arm RCT would be the preferable way to establish which method is better, such a trial would be very difficult to perform, as most bariatric surgeons perform only a limited range of bariatric procedures in their elective practice. This network meta-analysis demonstrates that the most commonly performed surgeries, such as LSG and LRYGB, are associated with better HRQoL. It also demonstrates that some novel techniques, such as LB-GB, are worth investigating to a greater extent, whereas others (distal-OAGB) are less promising from an HRQoL standpoint. Finally, this meta-analysis showed that LAGB is associated with worse QoL, which cements LAGB as the least favorable bariatric procedure in all aspects.

Limitations

The main limitation of this study is the heterogeneity between the included papers, including different study designs (RCT and non-RCT), non-homogeneous populations in terms of BMI or comorbidities, and different HRQoL instruments. We decided to include non-RCTs, such as cohort studies, to enable the comparison of as many different interventions as possible. We used the standardized mean difference to combine the results from different instruments, but to make the results easier to interpret we converted them to the GIQLI scale, which is widely used in bariatric surgery studies. The number of studies and the number of compared interventions decreased with the length of follow-up. Another factor that needs to be considered is the underrepresentation of some procedures, such as distal LRYGB, whereas LSG and LRYGB are the most common procedures analyzed in this review. Another limitation of the study is the quality of the included studies. Although observational studies were considered to be of moderate quality in general, their design is associated with lower confidence in estimates as compared with RCTs, while the majority of the included RCTs were at high risk of bias. We did not search for unpublished studies, which may have introduced publication bias. Formal testing of publication bias was not feasible due to the low number of studies for many comparisons. Nonetheless, this is a unique comparison of the different bariatric procedures, and some compromises were required to achieve it.

Conclusion

This is the first network meta-analysis comparing HRQoL after different bariatric procedures. It demonstrates that LSG and LRYGB may lead to better HRQoL across most follow-up time points. Long-term analysis shows that bariatric intervention results in better HRQoL than non-surgical interventions. Our analysis indicates that some procedures, such as VBG or LAGB, may lead to worse HRQoL. Future studies comparing different types of bariatric procedures should include HRQoL-related measures in their list of outcomes, alongside weight loss, comorbidities, and complications, to provide a holistic perspective on each procedure.