Introduction

In vestibular schwannoma (VS), a benign intracranial neoplasm located in the cerebellopontine angle (CPA) [1, 2], both stereotactic radiosurgery (SRS) and microsurgical tumor resection (SURGERY) are valid treatment options [3,4,5]. Postinterventional facial functional deterioration (FFD) after SURGERY is especially pronounced in large VS, whereas SRS fared significantly better with respect to facial preservation [5]. However, it has previously been shown that long-term tumor control in large VS (Hannover T3-T4 / Koos III-IV) is significantly inferior with SRS compared to SURGERY [5]. This illustrates the particular challenges of clinical decision-making in large VS.

The trade-off between the treatment efficacy of SURGERY (reduction of tumor recurrence/progression) and its increased adverse effects (e.g., FFD) compared to SRS needs to be illustrated in a well-rounded manner in order to translate clinical research results into clinical practice and thereby enable satisfactory patient consultation. Absolute risk reduction (ARR), absolute risk increase (ARI), and odds ratio are extensively used parameters to illustrate the benefit or disadvantage of one treatment over another [6]. However, in the context of clinical decision-making, it is also meaningful to use the number needed to treat (NNT) [7, 8]. Here, NNT is defined as the number of people who need to receive SURGERY instead of SRS to prevent one additional adverse outcome over a defined time period. It has been widely used in the scientific literature to communicate the benefits of a treatment (e.g., medication or vaccination) and serves as an epidemiological measure for reporting treatment impact [9,10,11,12]. Treatment toxicity is reported by the equivalent number needed to harm (NNH). For risk–benefit analysis, the likelihood of harm/help (LHH), calculated as the ratio of NNH to NNT, illustrates the trade-offs between the harms and benefits of two treatments [8, 11].

However, this kind of measurement has not yet been translated to neuro-oncology. The largest branch of clinical neuro-oncological research focuses on high-grade glioma with its devastating prognosis, and comparative effectiveness analyses that account for treatment toxicities remain in the background [13]. Given the benign nature of VS, the debate on different treatment modalities in VS management remains multi-faceted. The aim of this study was to illustrate the effectiveness in tumor control, to identify parameters that may indicate the effectiveness of either SURGERY or SRS in the therapy of large VS, and to characterize operative benefits in terms of NNT, NNH, and LHH by comparing both treatment modalities.

Methods

Study design and patient cohort

This was a retrospective dual-center cohort study. Study reporting followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines. Patients were identified from a prospectively kept registry. Previously treated VS, VS associated with neurofibromatosis, patients with pre-interventional FFD, and small VS (Koos I-II) were excluded from this study. Data were then retrospectively collected for patients treated between 2005 and 2011 at two specialized tertiary centers involved in the treatment of VS.

Data collection

Tumor size was classified by the Hannover Classification [14, 15]. Clinical state was reported using the House and Brackmann (H&B) [16] and Gardner-Robertson (G&R) scales (with H&B and G&R 1–2 considered to be a good outcome) [17]. Recurrence-free survival (RFS) was assessed radiographically by gadolinium-enhanced magnetic resonance imaging (MRI) [18, 19]. The criterion for tumor recurrence/progression was progressive growth on contrast-enhanced MRI (radiographic tumor control, RTC). To exclude the described phenomenon of pseudoprogression after SRS, patients with a tumor volume (TV) increase 6 months after SRS followed by stable or decreasing TV were not graded as VS recurrence/progression [20]. The TV was measured using slice-by-slice manual contouring. In case of SURGERY, extent of resection (EOR) was classified on the first postoperative MRI (3 months postoperatively): residual contrast-enhancing tumor was defined as subtotal resection (STR), whereas gross total resection (GTR) was defined as lack of contrast enhancement on MRI. Due to their low number and for statistical purposes, patients with subtotal resections were excluded. The local ethics committee approved this analysis, which was conducted according to the ethical standards laid down in the Declaration of Helsinki for research involving human subjects.

Treatment modalities

Patients treated by SURGERY were all operated on via the retrosigmoid approach using intraoperative electrophysiological monitoring in semi-sitting position under continuous echocardiography monitoring [21, 22]. All VS patients in the SRS cohort received Gamma-Knife-Radiosurgery (GKR – Elekta AB, Stockholm, Sweden) with a prescription dose of 13 Gy to the 65% isodose line [23].

Statistical analysis

Statistical analysis was performed in RStudio (Version 1.2) using descriptive statistics. Incidence of recurrence per patient-time was calculated as the number of recurrent events divided by the number of patient-days and is reported per 1 million patient-days. ARR was calculated as the difference in the incidence rate of recurrence, or of postoperative FFD, between patients treated with SRS and those treated with SURGERY. The ARR was then used to derive an estimate of treatment effectiveness, the NNT, defined as 1 / ARR. If the NNT is negative, as for undesirable outcomes (e.g., the occurrence of posttreatment long-term FFD), it is usually referred to as NNH. Concerning the occurrence of a recurrence, there was a positive NNT; thus, we refer to it as number needed to operate (NNO) throughout the manuscript. In contrast, SURGERY was a significant predictor for the occurrence of posttreatment FFD, resulting in a negative 1 / ARR (i.e., NNH). LHH was then calculated as the quotient NNH / NNO to illustrate the risk–benefit ratio [10, 11]. As the less invasive treatment option, SRS was used as the standard therapy in all NNO, NNH, and LHH analyses, to which SURGERY was then compared by considering treatment benefits and harms.
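
The definitions in this paragraph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in the study (which was run in R): the function names are hypothetical, and the risks in the usage lines are invented placeholder values rather than study data.

```python
def incidence_per_million_days(events: int, patient_days: float) -> float:
    """Recurrent events divided by patient-days, scaled to one million patient-days."""
    if patient_days <= 0:
        raise ValueError("patient_days must be positive")
    return events / patient_days * 1_000_000


def risk_difference(risk_srs: float, risk_surgery: float) -> float:
    """Risk under the reference therapy (SRS) minus risk under SURGERY."""
    return risk_srs - risk_surgery


def number_needed(difference: float) -> tuple[str, float]:
    """Return ("NNO", n) if SURGERY lowers the risk, ("NNH", n) if it raises it.

    n is the magnitude of 1 / difference, following the sign convention
    described in the text (a negative 1 / ARR is reported as NNH).
    """
    if difference == 0:
        raise ValueError("number needed is undefined for a zero risk difference")
    label = "NNO" if difference > 0 else "NNH"
    return label, abs(1.0 / difference)


def likelihood_of_harm_help(nnh: float, nno: float) -> float:
    """LHH as the quotient NNH / NNO; values above 1 favor SURGERY."""
    return nnh / nno


if __name__ == "__main__":
    # Hypothetical risks, for illustration only (not taken from the study).
    _, nno = number_needed(risk_difference(0.20, 0.10))  # recurrence
    _, nnh = number_needed(risk_difference(0.00, 0.05))  # facial deterioration
    print(nno, nnh, likelihood_of_harm_help(nnh, nno))
```

Returning the label together with the magnitude keeps the NNO/NNH distinction explicit instead of relying on the reader to interpret a negative number.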

To compare nonnumeric parameters of both groups, the chi-square test was applied. For numeric parameters, Welch’s two-sample t-test was used. RFS was estimated using the Kaplan–Meier method and compared between the two treatment groups using a log-rank test. The length of follow-up for RFS was calculated from the date of surgical or radiosurgical intervention to the date of either recurrence or the last clinical visit. Significance was defined as the probability of a two-sided type 1 error being < 5% (p < 0.05). Data are presented as mean ± standard deviation (SD) if not indicated otherwise.
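
For readers unfamiliar with the Kaplan–Meier method, the product-limit estimate can be sketched as follows. This is a plain-Python illustration with a hypothetical function name and made-up follow-up times; it is not the study's implementation, and it omits confidence intervals and the log-rank comparison.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of recurrence-free survival.

    times  -- follow-up per patient (e.g., months from intervention)
    events -- 1 if recurrence was observed at that time, 0 if censored
    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    recurrences = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = recurrences.get(t, 0)
        if d:
            # Patients censored at time t still count as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up data: recurrences at 10 and 30 months.
    print(kaplan_meier([10, 20, 30, 40], [1, 0, 1, 0]))
```

Censored patients (those without recurrence at their last clinical visit) reduce the size of the risk set without lowering the survival estimate, which is why follow-up length matters for interpreting tumor control.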

Results

From 2005 to 2011, 901 patients with primary and solitary VS were treated in both centers. Of those patients, n = 492 (55%) were classified as large VS (T3-T4) according to the Hannover Classification and used as the main study cohort in this analysis. Patients with preoperative FFD at H&B > 1 (n = 31; 6%) were excluded in accordance with the study design. From this study cohort (n = 460), n = 209 (45%) received SURGERY, while n = 251 (55%) received SRS. GTR was achieved in 95% (n = 198), while the rate of STR was 5% (n = 11), with six subtotal and five near-total resections (for detailed analysis of the STR subgroup, see supplementary material). The patient cohort flowchart is shown in Fig. 1A. Mean patient age was significantly higher in the SRS subgroup compared to SURGERY (p < 0.001). Cystic morphology was more often present in surgically treated patients (p = 0.002). Tumor size was unequally distributed (see Table 1), with larger tumors more likely to be treated with SURGERY than SRS.

Fig. 1

A Flowchart of patient cohort. B 10-Year Kaplan–Meier-Analysis for tumor-recurrence SRS versus SURGERY

Table 1 Patient demographics, tumor characteristics, incidence of shunt-dependency, recurrence, clinical presentation pre- and posttreatment, treatment complications, and Clavien-Dindo classification (CDC)

Preinterventional clinical parameters were similar in both groups. The rate of functional hearing at last follow-up was similar in both groups, with 27% in SRS and 23% in SURGERY (p = 0.625). Tinnitus, trigeminal symptoms, and vertigo were significantly improved by SURGERY (Table 1). New-onset facial spasm was an SRS–specific event with an incidence of 5% in SRS. In the SRS cohort, 0.3% experienced an FFD. Of all patients treated with SURGERY, 30% experienced a relevant early postinterventional FFD (H&B > 2). However, of these patients, 69% had improved to good facial function (H&B ≤ 2) after 1 year and at last follow-up. Therefore, the rate of permanent FFD (H&B > 2) at last follow-up was 9% in SURGERY.

The rate of direct postoperative FFD (H&B > 2) was 22% in patients under 40 years old, 34% in patients 40–50 years old, 37% in patients 50–60 years old, and 29% in patients older than 60 years. Notably, 54% of patients under age 40 with a poor facial outcome (H&B 3–6) directly after surgery recovered to good facial function (H&B 1–2) at the last follow-up. In those aged 41–50, the rate of facial recovery was 76%; in those 51–60, it was 57%; and in those over 60, it was 80%. Treatment complications were rare and mainly classified as CDC [24] class 2 (i.e., medically treated vasospasm, venous thrombosis, or brain edema), CDC 3a (i.e., nonsurgically treated CSF fistula), or CDC 3b (i.e., surgically treated CSF fistula, hemorrhage, hygroma, pneumocephalus, or hydrocephalus) (see Table 1).

In the present study cohort of large VS, the overall incidence of recurrence was 9%. The incidence of recurrence after respective monotherapy was significantly higher in SRS with 14% compared to SURGERY with 3% (see Fig. 1B). The incidence of recurrence of cystic VS (T3–T4) was 11%. Mean follow-up time was 79 (± 52.6) months in the whole study cohort, with 74 (± 52.7) months in SURGERY and 82 (± 52.2) months in SRS. Mean time to recurrence was longer in SURGERY with 102 (± 35.9) months compared to 57 (± 36.3) months in SRS (p = 0.007).

Tumor size affected tumor control after both treatment measures (SURGERY and SRS) (Fig. 2A). In line with this, the incidence of recurrence per one million patient-days was higher in SRS compared to SURGERY and depended on tumor size (Fig. 2B). SURGERY reduced recurrence by 42 events per one million patient-days (SRS: 55 recurrences per one million patient-days versus SURGERY: 13 recurrences per one million patient-days) (Table 2). In patients treated with SRS, the rate of recurrence was highest in patients under 40 years of age, at 20%, compared to the older patient subgroups (41–50 years = 11%; 51–60 years = 13%; and > 60 years = 15%).

Fig. 2

A 10-Year Kaplan–Meier-Analysis for tumor-recurrence in different tumor sizes (Hannover T3a-T4b) in SRS versus SURGERY. B shows incidence per patient time according to tumor size

Table 2 Incidence of recurrence per patient time (per one million patient days), Number needed to operate, Number needed to harm, and Likelihood of Harm by patient and tumor characteristics

Comparative Kaplan–Meier analyses for SURGERY versus SRS depending on age are shown in Fig. 3A. In the SRS–treated group, the incidence of recurrence per patient time was highest in those under 40 years old, with 75 recurrences per 1 million patient-days, and lowest in those aged 50–60, with 27. In the SURGERY group, tumor control illustrated per patient time was lower in the younger age groups (under 40 and 40–50) and higher in patients older than 50 and the elderly (Fig. 3B and Table 2).

Fig. 3

A shows the 10-Year Kaplan–Meier-Analysis for tumor-recurrence in different age groups: < 40 years, 40–50 years, 50–60 years and > 60 years in SRS versus SURGERY. B shows the incidence of recurrence per million patient days according to patient age

The differences in benefit and harm between the two treatment arms were calculated and are shown in Table 2. In the overall cohort, SRS presented with an incidence of recurrence of 14%, while SURGERY had a recurrence rate of 3%. Therefore, ARR was 11% (95% CI: 6.0–15.8%) when treating patients with SURGERY instead of SRS. This yielded an NNO of N = 10 (95% CI: 6.3–16.6), meaning that by treating 10 patients with SURGERY instead of SRS, one event of recurrence can be avoided. However, SURGERY increased the risk of FFD by 9% (95% CI: 4.6–12.7%) compared to SRS. FFD expressed by NNH was N = 12 (95% CI: 7.8–21.7). The overall LHH was therefore 1.20. LHH was in favor of SURGERY in the following subgroups: T3-T4a tumors, < 40 years of age, and cystic VS (Table 2 and Fig. 4). The number of patients with T4b tumors was N = 10 in either group, each with the same number of tumor recurrences (N = 1), and therefore the LHH analysis was not applicable (Fig. 4).
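
The relationship between the confidence interval of a risk difference and that of the corresponding NNO can be illustrated with a short sketch. It assumes the common approach of taking reciprocals of the interval bounds, which only applies when the interval excludes zero; whether the study derived its intervals in exactly this way is not stated, and the function name is hypothetical.

```python
def reciprocal_interval(lower: float, upper: float) -> tuple[float, float]:
    """Convert a CI for an absolute risk difference into a CI for NNO/NNH.

    Both bounds are proportions (0.06 for 6%). Taking reciprocals swaps the
    bounds: the larger risk difference gives the smaller number needed.
    """
    if not 0 < lower <= upper:
        raise ValueError("bounds must be positive and ordered (interval excludes zero)")
    return 1.0 / upper, 1.0 / lower


if __name__ == "__main__":
    # Reported ARR interval for recurrence: 6.0% to 15.8%.
    low, high = reciprocal_interval(0.060, 0.158)
    print(round(low, 1), round(high, 1))
```

With the rounded percentages reported above, the bounds come out close to the published NNO interval; small differences are presumably due to rounding of the underlying ARR values.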

Fig. 4

Likelihood of harm according to patient age and tumor size is shown in A and B respectively

Discussion

Our study used data from a retrospective, dual–center study to report NNO/NNH depending on tumor and patient characteristics, comparing SURGERY to SRS in large VS. In large VS (Hannover T3–T4), SURGERY was superior to SRS with regard to tumor control, with an absolute risk reduction of 11% for the incidence of recurrence, resulting in an NNO of N = 10. In other words, one tumor recurrence can be prevented when N = 10 patients are treated by SURGERY instead of SRS. However, the absolute risk increase in FFD was 9% for SURGERY compared to SRS, yielding an NNH of N = 12. LHH was therefore 1.2, formally favoring SURGERY in large VS. LHH calculations indicated a benefit of SURGERY in T3 tumors, cystic VS, and young patients.

Age–related differences and incidences per patient-time

Mean patient age was significantly higher in the SRS subgroup, indicating a provided-care bias towards more conservative management in the older patient cohort, an effect often seen in comparative studies [25, 26]. After all, SRS is a less invasive treatment option with fewer treatment-related side effects compared to SURGERY [6]. Nevertheless, SURGERY in the elderly was previously shown to be safe, with postoperative functional results similar to those in the young, even though premorbid status was worse [27,28,29].

The most remarkable age trend was observed when we analyzed the incidence of tumor progression after SRS as a monotherapy: Here, the incidence of progression rose from 11–12% to 20% in patients under the age of 40. This result was also reflected in the incidence per patient-days (74.95 events per 1 million patient-days). In the Kaplan–Meier analysis, SURGERY was superior to SRS in patients under the age of 50. A retrospective, multicenter study with 176 patients showed a 5–year progression–free survival rate of 90.9% and a 10–year progression–free survival rate of 86.7% with single–session SRS in patients under the age of 45 with large VS (Koos III–IV). However, the basis for these calculations was data with a median follow-up of 3 years [30]. In the interpretation of tumor–control data, special attention has to be paid to the follow-up time because mean time-to-recurrence has been reported to be longer than 5 years, and a shorter follow-up time may overestimate tumor control [5]. The strength of our work lies in its long follow-up period, with a mean of 79 months and a cumulative follow-up time of 1,112,639 patient-days (36,338 patient-months).

There are many ways to demonstrate tumor control as an endpoint in retrospective studies, including the following: incidence of recurrence, Kaplan–Meier analysis, and 5–year risk of recurrence, among others [30]. However, these measurements fail to express differences in saved recurrence/progression-free patient time and over- (Kaplan–Meier) or underestimate (5–year risk of recurrence) the true incidence of risk. By calculating the incidence per patient time, one is not only able to demonstrate the patient-days saved but also show the recurrence/progression rate in the context of the time to surveillance in the subgroups because this may vary in different age groups due to non–VS–related drop-outs or deaths.

Risk–benefit-calculation

To the best of our knowledge, this is the first study to introduce the notion of NNO, NNH and LHH in neuro-oncological analysis. This model of analysis is highly reproducible and can be applied to any setting when assessment of treatment strategies is required and when balancing between the magnitude of the survival/recurrence advantage and side-effects is the goal [11]. LHH indeed has a strong visual impact, especially in the subgroup analysis of patient age and tumor sizes as shown in our analysis. LHH calculations in the present cohort indicated a benefit of SURGERY in T3 tumors, cystic VS and young patients.

Our study reported a low rate of relevant permanent facial palsy (9% in large VS) compared to that in the available literature, wherein this value varied widely between 14 and 66% [25, 26, 31,32,33,34,35,36]. We consider it one of the strengths of this study that both centers were highly specialized in the treatment of VS with very high caseloads, so we were not comparing different surgical techniques but truly intermodally between specialized SRS and SURGERY monotherapy [5]. Independent of the absolute values, the present analysis demonstrates, in contrast to the simplified perception in current literature [1], that the risk–benefit analysis in VS does not unequivocally favor SRS. If we assume that tumor recurrence and FFD are equivalent in importance for patients’ onco-functional outcome, SURGERY (in this case, GTR) is justifiable even in light of the additional risk for FFD. Moreover, we demonstrated how NNO, NNH, and LHH varied depending on patients’ characteristics such as age, sex, tumor size, and tumor morphology. Our findings quantify the benefits of prioritizing SURGERY in large VS, particularly in T3a, T3b, T4a, cystic VS, and young VS patients.

This study is also intentionally designed to be thought-provoking: FFD and tumor control are starkly placed side by side in this comparative methodology, giving both aspects equal importance. We deliberately chose to be radical in only including patients treated with GTR, in order to compare the most aggressive surgical therapy (and therefore the treatment with the supposedly highest rate of facial morbidity) with the least invasive and most function-preserving treatment (SRS) and to emphasize the analysis endpoints at both extremes (i.e., LHH) [1]. The fact that facial preservation rates vary widely between different academic neurosurgical centers has resulted in a vivid discussion on intentional subtotal resection in VS to increase facial function preservation and therefore decrease the risk of harm (NNH) [1, 25, 26, 31,32,33,34,35,36]. However, it has been shown that tumor control worsens with increasing residual tumor, and therefore we would assume that NNO for tumor control would also increase in lower EOR grades [37,38,39]. Its proportional effect on LHH has to be evaluated in the future for different EOR grades and combination therapy to better illuminate the question of the impact of EOR in VS management.

Strengths and limitations of this study

This study is limited by its retrospective design. Even though the number of patients in this study was rather large, especially compared to the existing literature on large VS, our analysis could be even more meaningful in larger epidemiological study groups. NNT or NNO is a measurement of overall effect size within a cohort, so it cannot be translated directly to an individual patient or dictate an individual treatment decision [9]. However, because this study demonstrates how the measures change in different subgroups (e.g., tumor size, patients’ ages, etc.), these factors can inform individual treatment choice and consultation.

Conclusions

In this study, ARR for incidence of recurrence and ARI for facial palsy were comparable at 11% and 9%, respectively, and yielded an LHH of 1.2. Independent of the absolute values, the present analysis demonstrates that the risk–benefit analysis in large VS does not unequivocally favor SRS. Still, large VS should be treated only in specialized centers, which have enough experience to ensure a high rate of facial preservation in large VS.