Introduction

Obstructive sleep apnea (OSA) affects a large proportion of the adult population. One study estimates that 23% of women and 49% of men suffer from OSA, defined as an apnea–hypopnea index (AHI) of 15 or more events per hour [1]. Several studies have demonstrated long-term complications of untreated OSA, such as cardiovascular disease, stroke, and increased mortality [2, 3]. Continuous positive airway pressure therapy is the main treatment option, but 19–49% of patients do not adhere to this treatment [4].

Several surgical techniques have been developed as alternative treatments. The most frequently performed surgery is uvulopalatopharyngoplasty with or without tonsillectomy (UPPP ± TE) [5]. In 1981, Fujita et al. first described the procedure, which was subsequently modified and refined [6]. In a randomized controlled trial, UPPP with TE proved superior in reducing AHI compared to no intervention [5]. Stuck et al. performed a meta-analysis and found a good treatment effect of UPPP ± TE on respiratory events and daytime sleepiness [7].

Palatine tonsil size is a well-established risk factor for OSA, and multiple studies have correlated tonsil grade with tonsil volume measured in tonsillectomy specimens [8,9,10]. However, how the surgical outcome of UPPP ± TE depends on tonsil size and volume has not been comprehensively investigated, despite indications that tonsil size is an important predictor of successful surgery. In a previous pilot study, we demonstrated that tonsil size and volume are good predictors of the postoperative reduction of respiratory events but not of snoring or daytime symptoms [11]. Matarredona et al. partially replicated our findings for tonsil volume but found no association between tonsil grade and successful outcomes [12]. Both studies included only small cohorts, leaving many outcome parameters uninvestigated or statistically underpowered.

In this study, we extend our previous analysis on the impact of tonsil size and volume on the outcome of radiofrequency uvulopalatoplasty ± TE (rfUPP ± TE) on respiratory parameters and symptoms. The larger cohort with more detailed clinical examination parameters allows the investigation of other factors predicting the surgical outcome, which many authors have proposed [9, 13]. To our knowledge, this is the most extensive study investigating the effect of tonsil size, volume, and clinical predictors on the outcome of rfUPP ± TE.

Materials and methods

A retrospective analysis of all rfUPP ± TE surgeries at our institution between 2015 and 2021 was performed. Patients were offered rfUPP ± TE based on history, physical examination, and respiratory polygraphy. All severities of sleep-disordered breathing, including habitual snoring, were analyzed. We excluded patients with significant missing data, concomitant surgery other than nasal surgery, or aged less than 18 years. This study was conducted according to the Declaration of Helsinki and approved by the Swiss ethics committee (EKNZ 2021-02324). The reporting follows the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline [14].

Data collection

Before surgery, all patients underwent an extensive examination of the upper airway, summarized on a standardized report form. Tonsil size was graded using the system proposed by Brodsky and the tongue position according to Friedman [15, 16]. In the case of tonsil asymmetry, the larger side was used for clinical grading. Home sleep apnea testing and symptom questionnaires were performed preoperatively and 3 months postoperatively. Daytime sleepiness was reported using the Epworth Sleepiness Scale (ESS), depression symptoms using the Beck Depression Inventory (BDI), insomnia symptoms using the Insomnia Severity Index (ISI), and snoring intensity (SI) on a visual analog scale from 0 to 10 [17,18,19]. We recorded adverse events and assessed patients’ satisfaction by asking whether they would undergo surgery again or recommend it to a friend.

Surgery

Radiofrequency UPP ± TE was performed under general anesthesia with radiofrequency ablation of the soft palate and incision of the palatopharyngeal arch to release tension. Radiofrequency ablation was administered by inserting a bipolar electrode into the soft palate four to five times per side until a visible stiffening and contraction were achieved and deemed sufficient by the surgeon. Overly long uvulae (> 12 mm) were shortened. All patients with tonsils underwent a cold-steel tonsillectomy, irrespective of tonsil grade and OSA severity. Tonsils, if present, were removed to stabilize the lateral pharyngeal wall and increase upper airway volume. The volume of tonsillectomy specimens was measured using Archimedes’ principle of water displacement, and the sum of both tonsil volumes was used for statistical analysis.

Statistical analysis

Statistical analyses were performed in RStudio using R (R Project for Statistical Computing, Version 4.2.0) in consultation with the Clinical Trial Unit Basel. Tonsil volume was compared between tonsil grades using linear regression, and multiple regression was used to adjust for confounding factors. The primary endpoint for surgical outcomes was the absolute and relative reduction in AHI. Secondary endpoints were the effects of rfUPP ± TE on oxygen desaturation index (ODI), ESS, and SI. The Sher criteria, with a postoperative AHI ≤ 20/h and ≥ 50% reduction from baseline, were used to define success for respiratory parameters [20]. Likewise, the resolution of daytime sleepiness was defined as an ESS value ≤ 10 and a reduction of ≥ 50% from the preoperative value. Successful treatment of snoring was defined as a value ≤ 3 and a reduction ≥ 50% on a visual analog scale. Categorical variables were compared using Fisher’s exact or Chi-squared test with continuity correction. After testing for normality, continuous variables were compared using the t test and ANOVA for normally distributed variables or the Wilcoxon and Kruskal–Wallis tests for non-normally distributed variables. Multiple linear and logistic regressions were used to identify co-factors for score reduction and success. For patients lost to follow-up, baseline characteristics were compared to those of included patients to investigate a systematic bias due to attrition. Statistical tests were performed two-sided, and P values < 0.05 were considered statistically significant.
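The three responder definitions above share one structure: a ceiling on the postoperative value combined with a reduction of at least 50% from baseline. The following is a minimal Python sketch of that logic, not the analysis code used in the study (which was written in R). All function names are hypothetical, and raising an error for a baseline of zero is an assumption, since the text does not state how such cases were handled.

```python
def relative_reduction(pre: float, post: float) -> float:
    """Fractional reduction from baseline (0.5 corresponds to a 50% reduction)."""
    if pre <= 0:
        # Assumption: the relative reduction is undefined for a zero baseline.
        raise ValueError("baseline value must be positive")
    return (pre - post) / pre


def is_responder(pre: float, post: float, max_post: float,
                 min_reduction: float = 0.5) -> bool:
    """True if the postoperative value is at or below the ceiling AND
    the relative reduction from baseline is at least min_reduction."""
    return post <= max_post and relative_reduction(pre, post) >= min_reduction


def sher_responder(ahi_pre: float, ahi_post: float) -> bool:
    """Sher criteria: postoperative AHI <= 20/h and >= 50% reduction."""
    return is_responder(ahi_pre, ahi_post, max_post=20.0)


def ess_responder(ess_pre: float, ess_post: float) -> bool:
    """Daytime sleepiness: postoperative ESS <= 10 and >= 50% reduction."""
    return is_responder(ess_pre, ess_post, max_post=10.0)


def snoring_responder(si_pre: float, si_post: float) -> bool:
    """Snoring: postoperative visual analog score <= 3 and >= 50% reduction."""
    return is_responder(si_pre, si_post, max_post=3.0)
```

Both conditions must hold at the same time: a patient whose AHI falls from 30/h to 18/h ends below the 20/h ceiling but achieves only a 40% reduction and therefore does not count as a responder.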

Results

In total, 354 patients underwent palatal surgery for sleep-disordered breathing at our institution. Six patients were excluded for missing head and neck examination and 41 for missing tonsil volume documentation, leaving 307 patients included in this study and analyzed for the baseline characteristics. For the follow-up, we excluded one patient for concomitant hyoid suspension and 78 patients for missing follow-up data, leaving 228 patients to be analyzed postoperatively. The median follow-up was 100 days (interquartile range 90–135 days). Patient characteristics are summarized by tonsil grade in Table 1.

Table 1 Baseline characteristics of all patients and by tonsil grade given as mean and standard deviations in round brackets or median and interquartile range in square brackets

Tonsil volume and grade

Tonsil volume significantly increased with each tonsil grade (Fig. 1, Table 1). Tonsil volume was significantly higher in men than in women (6.2 ± 3.7 ml vs. 4.9 ± 2.9 ml, P = 0.02), decreased with age by 0.09 ml per year (95% CI 0.05–0.12 ml per year; P < 0.001), and increased with BMI by 0.10 ml per 1 kg/m2 (95% CI 0.001–0.20 ml; P = 0.05). No other patient characteristic significantly influenced tonsil volume. A linear regression model showed a tonsil volume increase of 2.5 ml (95% CI 2.14–2.92 ml; P < 0.001) for each grade. After adjusting for sex, age, and BMI, tonsil volume increased by 2.4 ml (95% CI 1.9–2.8 ml; P < 0.001) per tonsil grade.

Fig. 1 Palatine tonsil volume increases with tonsil grade according to Brodsky. Each tonsil grade is given as a boxplot with the mean indicated by a diamond shape. After adjusting for sex, age, and BMI, tonsil volume increased by 2.4 ml (95% CI 1.9–2.8 ml; P < 0.001) per tonsil grade (n = 307)

Apnea–hypopnea index (AHI)

Preoperative AHI increased with tonsil grade and tonsil volume. Linear regression showed preoperative AHI to increase by 4.0/h (95% CI 1.5/h–6.4/h; P = 0.002) for every grade and by 0.8/h (95% CI 0.2/h–1.3/h; P = 0.01) per ml of total tonsil volume (see Online Resource 1 and 2). Postoperative AHI values were not significantly influenced by tonsil volume or grade (P = 0.07 and 0.55, respectively). The absolute AHI reduction increased with volume by 1.3/h per ml (95% CI 0.7/h–1.9/h; P < 0.001) and with tonsil grade by 5.4/h per grade (95% CI 2.8/h–8.0/h per grade; P < 0.001), see Fig. 2 and Table 2. This relationship was also significant for the relative AHI reduction, which increased by 21.8 ± 7.9% per grade (P = 0.006) and 5.9 ± 1.8% per ml (P = 0.006). After controlling for age, BMI, and neck circumference, tonsil volume and grade were still significantly associated with preoperative AHI and the reduction through surgery. The tonsil volume of responders, according to Sher, was significantly higher than that of non-responders (7.0 ± 4.1 ml vs. 5.2 ± 2.8 ml, P < 0.001, see Online Resource 3). From tonsil grade 0 to 4, the rate of AHI responders increased stepwise: 14% (1/7), 30% (18/60), 42% (48/113), 57% (24/42), and 83% (5/6), respectively (P = 0.008). The increasing responder rate with tonsil volume and grade is shown in the logistic models in Fig. 3. A comparison of response by preoperative OSA severity is given in Table 3.
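The responder rates per tonsil grade follow directly from the counts given in the text. The short Python sketch below recomputes them as a consistency check; it uses only the counts reported above, the variable and function names are hypothetical, and it does not attempt to reproduce the reported P value.

```python
# Responders / patients per Brodsky tonsil grade, as reported in the text.
responders_by_grade = {
    0: (1, 7),
    1: (18, 60),
    2: (48, 113),
    3: (24, 42),
    4: (5, 6),
}


def responder_rates(counts: dict[int, tuple[int, int]]) -> dict[int, float]:
    """Return the responder rate in percent for each tonsil grade."""
    return {grade: 100.0 * r / n for grade, (r, n) in counts.items()}


rates = responder_rates(responders_by_grade)
total_patients = sum(n for _, n in responders_by_grade.values())

for grade, rate in rates.items():
    r, n = responders_by_grade[grade]
    print(f"grade {grade}: {r}/{n} = {rate:.0f}%")
print(f"patients with follow-up: {total_patients}")
```

The group sizes add up to the 228 patients analyzed postoperatively. The groups at both ends of the scale are small (7 and 6 patients), so their rates are considerably less precise than those of grades 1 to 3.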

Fig. 2 Apnea–hypopnea reduction increases with palatine tonsil grade (A) and tonsil volume (B). Figure A displays a linear regression with a 95% confidence interval. A diamond shape indicates the mean in Figure B. The absolute AHI reduction increased with volume by 1.3/h per mL (95% CI 0.7/h–1.9/h; P < 0.001) and tonsil grade by 5.4/h per grade (95% CI 2.8/h–8.0/h per grade; P < 0.001) (n = 228)

Table 2 Comparison of pre- and postoperative sleep studies (n = 228)

Fig. 3 Logistic model for responder according to Sher by tonsil volume (A) and tonsil grade (B) with 95% confidence interval, indicating an increased odds ratio for success with larger tonsils (n = 228)

Table 3 Response to surgery stratified by preoperative sleep apnea severity defined as no OSA (AHI < 5), mild OSA (AHI 5–15), moderate OSA (AHI 15–30), and severe OSA (AHI > 30) (n = 228)

Oxygen Desaturation Index (ODI)

Preoperative ODI increased by 4.0/h (95% CI 1.1/h–6.8/h; P = 0.006) per grade and by 0.8/h (95% CI 0.2/h–1.4/h; P = 0.01) per ml of total tonsil volume. Postoperative ODI was not significantly influenced by tonsil grade (P = 0.87) or tonsil volume (P = 0.43). The absolute ODI reduction increased for every tonsil grade by 5.1/h (95% CI 2.3/h–8.0/h; P < 0.001) and by 0.9/h (95% CI 0.3/h–1.5/h; P = 0.006) per ml of total tonsil volume. Similarly, the relative ODI reduction increased by 41.0 ± 18.6% (P = 0.03) and 5.6 ± 4.0% (P = 0.006) per grade and ml total volume, respectively.

Epworth Sleepiness Scale (ESS)

Excessive daytime sleepiness (ESS > 10) was reported by 35% (99/283) of patients before and 1.5% (2/135) after surgery. 60% fulfilled the ESS responder criteria. Patients who responded to treatment regarding ESS had significantly larger tonsils than non-responders (6.5 ± 3.4 ml vs. 6.2 ± 3.8 ml, P < 0.001), although the absolute volume difference was small. Pre- and postoperative ESS did not differ significantly among tonsil grades (P = 0.12 and 0.19, respectively, Table 1). There was also no significant association of tonsil volume with pre- or postoperative ESS (P = 0.06 and 0.43, respectively). The absolute reduction in ESS was correlated with neither tonsil grade (P = 0.06) nor tonsil volume (P = 0.44).

Snoring

Reported snoring intensity decreased from 8 [7–9] to 3 [2–4] (P < 0.001), with 54% of patients (61/112) fulfilling the responder criteria. No association was found between tonsil grade or volume and preoperative, postoperative, or reduction in snoring. However, responders for snoring had significantly larger tonsils with 6.5 ± 3.3 ml compared to 6.0 ± 2.7 ml for non-responders (P < 0.001).

Co-factors

We further explored the data for predictors of outcome.

Sex did not significantly affect pre- or postoperative values or changes in AHI, ODI, ESS, or snoring.

Age was positively correlated with pre- and postoperative AHI with an increase of 3.8 ± 0.1/h (adjusted R2 = 0.04, P < 0.001) and 3.6 ± 0.1/h (adjusted R2 = 0.07, P < 0.001) for every year, respectively. The reduction through surgery was not influenced by age (P = 0.54).

Neck circumference correlated positively with pre- (1.3 ± 0.4/h per cm, adjusted R2 = 0.06, P < 0.001) and postoperative (1.2 ± 0.3/h per cm, adjusted R2 = 0.07; P < 0.001) AHI, but not the AHI reduction (P = 0.63).

BMI positively correlated with pre- (adjusted R2 = 0.06; P < 0.001) and postoperative (adjusted R2 = 0.06; P < 0.001) AHI. The reduction through surgery did not depend on BMI (P = 0.54).

Positional OSA, as measured by the Cartwright index, was inversely correlated with preoperative AHI (− 4.2 ± 1.0, adjusted R2 = 0.08; P < 0.001) and AHI reduction (− 2.8 ± 0.9, adjusted R2 = 0.04; P < 0.003), without a significant correlation with postoperative AHI (P = 0.08).

Insomnia symptoms slightly improved after surgery without reaching significance. On the Insomnia Severity Index, patients reported 12.5 ± 6.4 and 7.7 ± 5.6, pre- and postoperatively (P = 0.11). For depression symptoms, there were no changes from pre- to postoperative.

Concomitant nasal surgery was performed in 97 of 228 patients. A subgroup analysis showed that nasal surgery did not influence outcomes regarding AHI, ODI, ESS, or snoring (see Online Resource 4).

Preoperative AHI increased with each Friedman tongue grade by 5.4 ± 1.5/h (P = 0.008). However, no significant relationship between AHI reduction and Friedman tongue position was found. No other head and neck examination finding was associated with pre- or postoperative AHI or with its reduction through surgery. These findings included the Friedman Staging System of OSA, septal deviation and turbinate hypertrophy, tongue base hypertrophy, epiglottis configuration, soft palatal webbing, the distance of the soft palate to the posterior pharyngeal wall, uvula configuration, dental status, tooth position (Angle 1–3), and bruxism.

Adverse events and patient satisfaction

Adverse events after surgery were reported by 26% (60/228) of patients. A foreign body sensation was the most frequent complaint, reported by 38 patients, followed by difficulty swallowing in 20 patients. These symptoms were mainly temporary; only 26 patients reported symptoms after 3 months. Overall, 91% (109/120) of patients would undergo surgery again, and 94% (110/117) would recommend it to a friend.

Lost to follow-up

Patients who were lost to follow-up were younger (39.9 ± 11.8 years vs. 46.2 ± 11.0 years, P < 0.001) and had significantly lower preoperative AHI (lost to follow-up 17.3 ± 17.3/h vs. followed 26.1 ± 18.8/h, P < 0.001). No significant difference between the lost to follow-up and followed patients was found regarding tonsil grade (P = 0.40), tonsil volume (followed 5.9 ± 3.6 ml vs. lost 6.5 ± 3.7 ml, P = 0.07), sex (P = 0.55) or BMI (followed 28.9 ± 4.4 kg/m2 vs. lost 27.8 ± 5.0 kg/m2, P = 0.12).

Discussion

In this study, we extend our previously published analysis of patients undergoing rfUPP ± TE to a larger cohort with more detailed patient characteristics [11]. The main differences from the previous study are that we included all severities of sleep-disordered breathing and patients with prior TE undergoing isolated rfUPP. The larger sample size addresses analyses that were previously underpowered and allows for more robust conclusions.

We found a good correlation between intraoperatively measured tonsil volume and tonsil grade on clinical examination, even after controlling for confounders. Tonsil volume was higher in men and increased with BMI but decreased with age. The effects of age and BMI are minor but statistically significant. In contrast to our previous analysis, the measured volume increased with each grade, indicating that tonsil volume can be predicted well by clinical examination. The intramural part of the tonsils is not visible on clinical examination and might account for some of the observed variability. Mengi et al. demonstrated a good correlation of tonsil volume with transcervical sonography measurements, indicating that sonography might help to estimate tonsil volume better than clinical examination alone [21].

Tonsil grade and volume are significantly associated with higher preoperative AHI and a greater reduction in AHI through rfUPP ± TE. Age, BMI, and neck circumference were positively correlated with pre- and postoperative AHI but did not influence the reduction through surgery. Even after controlling for these variables, tonsil volume and grade predicted preoperative AHI and AHI reduction. The results for ODI follow the same relationships with predictors as AHI. For daytime sleepiness and snoring, we found only a trend toward higher tonsil grades resulting in better outcomes of palatal surgery. Daytime sleepiness is more likely influenced by additional factors such as arousal threshold and disruption of sleep architecture and is, therefore, subject to greater variability. Snoring perception is also subject to other factors such as loudness, regularity, and throatiness, and its assessment lies mainly in the ear of the beholder. In our cohort, no other parameters in the history or physical examination predicted outcomes of AHI, daytime sleepiness, or snoring.

Outcome studies of UPPP ± TE rarely report tonsil grade or measure volume. Our results demonstrate that this is a crucial factor in predicting outcomes. The lack of reported tonsil size is a major limitation when comparing different techniques of palatal surgery or other interventions for sleep-disordered breathing. Without the knowledge of tonsil size, study results are not directly applicable to the individual patient. Several studies observed tonsil size to be correlated with preoperative OSA severity [8,9,10]. However, we could find only one study investigating the role of tonsil size on the outcome of UPPP ± TE, which describes a significant correlation between surgical outcomes and tonsil volume but not the tonsil grade [12]. A meta-analysis by Camacho et al. and a recent randomized clinical trial by Sundman et al. found that isolated TE without palatal surgery can effectively reduce AHI, indicating that the removal of tonsils might be a crucial part of a UPPP procedure [22, 23]. They also remark that few authors systematically remove smaller tonsils if they are still present, leading to biased outcome studies. A review by Maurer showed that success rates double if a tonsillectomy is performed in addition to a UPPP [24]. In the present study, all tonsils were removed irrespective of size and OSA severity, eliminating any bias toward positive results by removing only large tonsils. Several, sometimes conflicting, predictors of success are reported in the literature, such as tongue position [13], Friedman staging of OSA [16, 25], age [13], and hyoid position [25, 26]. Besides the anatomical description of the upper airway, a neuromuscular component may influence outcomes but is hard to quantify and remains elusive [27]. Drug-induced sleep endoscopy offers a dynamic evaluation. A study by Chui et al. found a complete concentric collapse of the soft palate to be a negative predictor for outcomes of UPPP without TE [28].

Finding surgical success predictors is complicated by the large variability in preoperative factors and considerable night-to-night variability of AHI [29]. Interestingly, the AHI of some patients with absent or small tonsils deteriorated in our study. This might not be an actual deterioration but rather due to the night-to-night variability in patients with mostly low baseline AHI values [29]. This inherent variability in sleep medicine means that large cohort sizes are required for adequately powered statistical analyses, which are rarely found in surgical outcome studies. In our pilot study with 70 patients, the preoperative AHI increased by 1.4/h per ml of tonsil volume, whereas, in the present study, this was only 0.8/h per ml based on a much larger cohort. Moreover, tonsil volumes between grades I and II did not differ significantly in the pilot study, whereas they did in the present cohort. In our opinion, the data set of this study allows for the most reliable assessment of how tonsil size and volume relate to AHI, ODI, BMI, and age. Knowledge of the most decisive factors for the outcome of UPPP ± TE is crucial when counseling patients before this procedure. Thus far, tonsil size, if measured, is frequently among the strongest predictors of success [12, 13]. Notably, patients with absent tonsils undergoing an isolated rfUPP did not show a reduction in postoperative AHI, and some patients experienced a deterioration after surgery.

Our study has several limitations. The cohort consisted of predominantly male and middle-aged patients with BMI ranging from normal to moderately overweight. Confounders of sleep testing such as alcohol, sleep medication, nicotine, and caffeine intake were not directly controlled. However, subjects were counseled to follow their habitual routines for every measurement to achieve comparable results. In all patients, radiofrequency ablation of the soft palate and incision of the palatopharyngeal arch were performed, limiting variability and allowing for a homogeneous cohort. The results from the radiofrequency UPP might not be generalizable to other types of soft palate surgeries, including more conventional UPPP using suturing techniques. A recent randomized controlled trial showed that radiofrequency UPP without TE and UPPP with TE were both effective in reducing daytime sleepiness and AHI. However, the AHI responder rate in moderate OSA was significantly lower for rfUPP without TE than for conventional UPPP with TE (30% vs. 77%) [30]. We believe that tonsillectomy was the decisive factor for the difference in the outcome.

We included patients with nasal surgery in the analysis, since several studies have shown that nasal surgery does improve nasal obstruction but does not systematically affect OSA results [13, 31, 32]. Nasal surgery did not significantly affect surgical outcomes in our cohort. However, concomitant nasal surgery might introduce additional variability due to relatively unpredictable effects on sleep-disordered breathing.

A substantial proportion of patients were lost to follow-up. Patients lost to follow-up were younger and had less severe OSA at baseline but did not differ in tonsil size and volume from those included. Our results might be negatively biased, since patients with a resolution of their symptoms are hypothetically less likely to come for a follow-up visit. With a follow-up after 3 months, the current study cannot estimate the long-term results, which might deteriorate over time [7]. Overall, rfUPP ± TE improved respiratory parameters, snoring, and daytime sleepiness with high patient satisfaction and few adverse effects.

Conclusion

Intraoperatively measured tonsil volume strongly correlates with tonsil size on clinical examination. rfUPP ± TE is an effective surgical treatment for reducing respiratory parameters, snoring, and daytime symptoms. Tonsil size on clinical examination and intraoperatively measured volume are good predictors of postoperative AHI and ODI reduction. In contrast, snoring and daytime sleepiness improve irrespective of removed tonsil volume. When counseling patients before UPPP ± TE, surgeons should know that tonsil size and volume are the most crucial factors for AHI reduction. However, they should also keep in mind that the improvement of daytime sleepiness and snoring, which might be even more important than AHI reduction for some patients, does not depend on the volume of removed tonsil tissue.